Preface



The Fast Software Encryption Workshop 1999 is the sixth in a series of workshops 
starting in Cambridge in December 1993. 

The workshop was organized by General Chair William Wolfowicz, Fondazione
U. Bordoni, and Programme Chair Lars Knudsen, University of Bergen, Norway,
in cooperation with Securteam, as far as local arrangements were concerned.
The workshop was held March 24-26, 1999 in Rome, Italy.

The workshop concentrated on all aspects of fast secret key ciphers, including
the design and cryptanalysis of block and stream ciphers, as well as hash
functions.

There were 51 submissions, all of them submitted electronically. One
submission was later withdrawn by the authors, and 22 papers were selected for
presentation. All submissions were carefully reviewed by at least 4 committee
members. At the workshop, preliminary versions of all 22 papers were
distributed to all attendees. After the workshop there was a final reviewing
process with additional comments to the authors.

It has been a challenge for me to chair the committee of this workshop, and it
is a pleasure to thank all the members of the programme committee for their hard
work. The committee this year consisted of, in alphabetical order, Ross Anderson
(Cambridge, UK), Eli Biham (Technion, Israel), Don Coppersmith (IBM,
USA), Cunsheng Ding (Singapore), Dieter Gollmann (Microsoft, UK), James
Massey (Denmark), Mitsuru Matsui (Mitsubishi, Japan), Bart Preneel (K.U.
Leuven, Belgium), Bruce Schneier (Counterpane, USA), and Serge Vaudenay
(ENS, France).

It is a great pleasure to thank William Wolfowicz for organising the workshop.
Also, it is a pleasure to thank Securteam for the logistics and Telsy and Sun for
supporting the conference. A big thank you goes to all submitting authors for
their contributions, and to all attendees (approximately 165) of the workshop.
Finally, I would like to thank Vincent Rijmen for his technical assistance in
preparing these proceedings.



April 1999 



Lars Knudsen 
Abstract. RC6 has been submitted as a candidate for the Advanced
Encryption Standard (AES). Two important features of RC6 that were 
absent from its predecessor RC5 are a quadratic function and a fixed 
rotation. By examining simplified variants that omit these features we 
clarify their essential contribution to the overall security of RC6. 



1 Introduction 

RC6 is an evolutionary improvement of the block cipher RC5 0 that was de- 
signed to meet the requirements of the Advanced Encryption Standard (AES). 
Like RC5, RC6 makes essential use of data-dependent rotations, but it also in- 
cludes new features such as the use of four working registers instead of two, and 
the inclusion of integer multiplication as an additional primitive operation. Two 
components of RC6 that were absent from RC5 are a quadratic function to mix 
bits in a word more effectively and a fixed rotation that is used both to hinder the 
construction of good differentials and linear approximations and also to ensure 
that subsequent data dependent rotation amounts are more likely to be affected 
by any ongoing avalanche of change. 

An initial analysis of the security of RC6 and its resistance to the basic 
forms of differential and linear cryptanalysis was given in [3]. Here we further
illustrate how these new operations contribute to the security of RC6 by studying 
simplified variants (that is, intentionally weakened forms) of RC6. In particular, 
our approach is to find the best attack on the weakened forms and then try to 
adapt the attack to the full cipher. Since one of the design principles of RC6 
was to build on the experience gained with RC5, the focus of our analysis will 
be in assessing the relevance to RC6 of the best existing cryptanalytic attacks 
on RC5. We will often refer to the work of Knudsen and Meier [8] and that
of Biryukov and Kushilevitz [2]. These authors in particular have made very
significant advances in understanding the security of RC5. 

Our work splits naturally into two parts. The first focuses on the usefulness of 
the fixed rotation and the second on the quadratic function. While our analysis 
is targeted at RC6 and its simplified variants, some of the results might well be 
of independent interest. Our analysis starts by considering some of the weakened
variants of RC6 that were introduced in [3]. More specifically, by dropping the
fixed rotation we derive a cipher that we will denote by RC6-NFR (where NFR
stands for no fixed rotation), by dropping the quadratic function we obtain RC6-I
(where I stands for the identity function), and by dropping both operations we
have RC6-I-NFR.

We will consider characteristics and differentials for RC6-I-NFR and RC6-NFR
that have already been described in [3]. We study the relations between
certain values of the subkeys and the probability of a characteristic and/or
differential. Such phenomena are similar to the “differentially-weak keys” of RC5
observed by Knudsen and Meier [8]. We describe our observations and provide a
thorough analysis which suggests that inclusion of the fixed rotation destroys the
structure required for such dependencies to form. As a consequence RC6-I and
RC6 itself seem to be immune from any direct extension of the results previously
obtained on RC5.

Second, we examine the diffusive properties of the quadratic function and
other operations that are used in RC6. In this analysis we track the Hamming
weight (the number of 1's) of the exclusive-or difference between two quantities as
they are encrypted. Quite naturally this leads to the idea of differentials that are
constructed using such a measure of difference, and this notion is very similar
in spirit to earlier work on RC5 [2,8]. We show that the quadratic function
drastically increases the Hamming weight of a difference when the
Hamming weight of the input difference is small. This indicates that the use
of both the quadratic function and data-dependent rotations in RC6 makes it
unlikely that differential attacks similar to those that were useful for RC5 [2,8]
can be effectively extended to RC6.

2 Description of RC6 and Variants 

A version of RC6 is specified as RC6-w/r/b where the word size is w bits,
encryption consists of a nonnegative number of rounds r, and b denotes the length
of the encryption key in bytes. Throughout this paper we will set w = 32, r = 20,
b = 16, 24, or 32 and we will use RC6 to refer to this particular version. The
base-two logarithm of w will be denoted by lg w and RC6 uses the following six
basic operations:

    $a + b$         integer addition modulo $2^w$
    $a - b$         integer subtraction modulo $2^w$
    $a \oplus b$    bitwise exclusive-or of w-bit words
    $a \times b$    integer multiplication modulo $2^w$
    $a \lll b$      rotate the w-bit word a to the left by the amount
                    given by the least significant lg w bits of b
    $a \ggg b$      rotate the w-bit word a to the right by the amount
                    given by the least significant lg w bits of b

The user supplies a key of length b bytes which is then expanded to a set
of subkeys. The key schedule of RC6 is described in m- Since here we are 
only concerned with encryption, we will assume that the subkeys S[0], ..., S[43]
are independent and chosen at random. RC6 works with four w-bit registers
A, B, C, D which contain the initial input plaintext as well as the output
ciphertext at the end of encryption. We use (A, B, C, D) = (B, C, D, A) to mean the
parallel assignment of values on the right to registers on the left.

Encryption with RC6-w/20/b

Input:      Plaintext stored in four w-bit input registers A, B, C, D
            w-bit round keys S[0, ..., 43]

Output:     Ciphertext stored in A, B, C, D

Procedure:  B = B + S[0]
            D = D + S[1]
            for i = 1 to 20 do
            {
                t = (B × (2B + 1)) <<< lg w
                u = (D × (2D + 1)) <<< lg w
                A = ((A ⊕ t) <<< u) + S[2i]
                C = ((C ⊕ u) <<< t) + S[2i + 1]
                (A, B, C, D) = (B, C, D, A)
            }
            A = A + S[42]
            C = C + S[43]

The three simplified variants of RC6 that we will consider throughout the 
paper are distinguished from RC6 in the way the values of t and u are assigned. 
These differences are summarized in the following table. 



The assignment of t and u in RC6 and some weakened variants

         RC6-I-NFR    RC6-I         RC6-NFR         RC6
  t =    B            B <<< lg w    B × (2B + 1)    (B × (2B + 1)) <<< lg w
  u =    D            D <<< lg w    D × (2D + 1)    (D × (2D + 1)) <<< lg w
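The encryption procedure and the four ways of assigning t and u can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: w = 32, the subkeys are supplied directly as a list S (the key schedule is not modelled), and the names `mix`, `VARIANTS`, `encrypt` and `decrypt` are hypothetical helpers introduced here, not part of any RC6 specification.

```python
MASK = 0xFFFFFFFF  # word size w = 32
LGW = 5            # lg w

def rotl(a, b):
    """Rotate the 32-bit word a left by the low lg w bits of b."""
    r = b & 31
    return ((a << r) | (a >> (32 - r))) & MASK

def rotr(a, b):
    """Rotate the 32-bit word a right by the low lg w bits of b."""
    r = b & 31
    return ((a >> r) | (a << (32 - r))) & MASK

# (use quadratic function?, use fixed rotation?) for each variant
VARIANTS = {
    "RC6-I-NFR": (False, False),
    "RC6-I": (False, True),
    "RC6-NFR": (True, False),
    "RC6": (True, True),
}

def mix(x, quadratic, fixed_rotation):
    """Compute t (from B) or u (from D) for the chosen variant."""
    if quadratic:
        x = (x * (2 * x + 1)) & MASK
    if fixed_rotation:
        x = rotl(x, LGW)
    return x

def encrypt(block, S, variant="RC6", rounds=20):
    """Encrypt (A, B, C, D) with subkeys S[0 .. 2*rounds + 3]."""
    quad, fixed = VARIANTS[variant]
    A, B, C, D = block
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, rounds + 1):
        t = mix(B, quad, fixed)
        u = mix(D, quad, fixed)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    A = (A + S[2 * rounds + 2]) & MASK
    C = (C + S[2 * rounds + 3]) & MASK
    return A, B, C, D

def decrypt(block, S, variant="RC6", rounds=20):
    """Inverse of encrypt, included only to sanity-check the sketch."""
    quad, fixed = VARIANTS[variant]
    A, B, C, D = block
    C = (C - S[2 * rounds + 3]) & MASK
    A = (A - S[2 * rounds + 2]) & MASK
    for i in range(rounds, 0, -1):
        A, B, C, D = D, A, B, C          # undo the register rotation
        t = mix(B, quad, fixed)
        u = mix(D, quad, fixed)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return A, B, C, D
```

Passing `rounds=6` gives the reduced-round versions that the six-round characteristics discussed later refer to; the only difference between the four variants is the pair of flags handed to `mix`.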



3 The Fixed Rotation 

In [8] Knudsen and Meier show that the values of some of the subkeys in RC5
can have a direct effect on the probability of whether some differential holds. In 
this section we show that a similar phenomenon can be observed in weakened 
variants of RC6 that do not use the fixed rotation. This should perhaps come as 
little surprise since while the structure of RC6-I-NFR is very different to that of 
RC5, it uses the same operations and might be expected to have similar behavior 
at times. We will then consider the role of the fixed rotation used in RC6 and we 
will demonstrate by analysis and experimentation that the effects seen in RC5 
and some simplified variants of RC6 do not seem to exist within RC6 itself. 
3.1 Existing Analysis on RC6-I-NFR and RC6-NFR

In [3] one potentially useful six-round iterative characteristic was provided for
attacking both RC6-I-NFR and RC6-NFR. This is given in Table 1. Here $e_t$ is
used to denote the 32-bit word that has all bits set to zero except bit t, where
t = 0 for the least significant bit. We use $A_i$ (respectively $B_i$, $C_i$ and $D_i$) to
denote the values of registers A (respectively B, C, and D) at the beginning of
round i. As an example, $A_1$, $B_1$, $C_1$, and $D_1$ contain the plaintext input after
pre-whitening and for the six-round variants of the cipher, $A_7$, $B_7$, $C_7$, and $D_7$
contain the output prior to post-whitening. According to [3], when averaged
over all possible subkeys, the expected probability that this characteristic holds
is $2^{-30}$ for RC6-I-NFR and RC6-NFR.

3.2 Refined Analysis of RC6-I-NFR and RC6-NFR 

Closer analysis of the characteristic probabilities for RC6-I-NFR and RC6-NFR
suggests that the values of some of the subkeys during encryption are important.
In particular, the characteristic of interest for RC6-I-NFR and RC6-NFR given
in Table 1 can only occur if certain subkey conditions are met. Further, once
these subkey conditions hold then the characteristic occurs with probability $2^{-20}$,
which is much higher than the initial estimate of $2^{-30}$ that was obtained by
averaging over all subkeys.



  i    $A_i$       $B_i$       $C_i$       $D_i$
  1    $e_{31}$    $e_{31}$    0           0
  2    $e_{31}$    0           0           0
  3    0           0           0           $e_{31}$
  4    0           $e_{31}$    $e_{31}$    0
  5    $e_{31}$    $e_{31}$    0           $e_{31}$
  6    $e_{31}$    $e_{31}$    $e_{31}$    0
  7    $e_{31}$    $e_{31}$    0           0

Table 1. A characteristic for RC6-I-NFR and RC6-NFR.



In the analysis that follows we will concentrate on RC6-NFR. The same
arguments and results can be applied to RC6-I-NFR by replacing
$f(x) = x \times (2x + 1)$ with the identity function $f(x) = x$. We will use the fact
that $x \bmod 2^i$ uniquely determines $(x \times (2x + 1)) \bmod 2^i$. Furthermore, the
notation “$=_{32}$” will be used to indicate when two values are congruent modulo 32.
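The fact about the low bits of the quadratic function is easy to check numerically. The sketch below assumes w = 32; the helper names are hypothetical. It confirms that f(x) mod 2^i depends only on x mod 2^i, and also that f is a bijection on residues modulo 2^i, which is what allows f to be cancelled from both sides of a congruence modulo 32.

```python
MASK = 0xFFFFFFFF  # w = 32

def f(x):
    """The RC6 quadratic function x * (2x + 1) mod 2^32."""
    return (x * (2 * x + 1)) & MASK

def low_bits_determine_output(i, high_samples):
    """Check that f(x) mod 2^i is the same for every sampled x sharing x mod 2^i."""
    m = 1 << i
    for low in range(m):
        ref = f(low) % m
        for high in high_samples:
            x = ((high << i) | low) & MASK
            if f(x) % m != ref:
                return False
    return True

def is_bijection_mod(i):
    """Check that x -> f(x) mod 2^i permutes the residues 0 .. 2^i - 1."""
    m = 1 << i
    return len({f(x) % m for x in range(m)}) == m

print(low_bits_determine_output(5, [1, 12345, 0x7FFFFFF]))
print(is_bijection_mod(5))
```

Both properties follow from f(x) - f(y) = (x - y)(2(x + y) + 1), where the second factor is always odd.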
Lemma 1. If the characteristic given in Table 1 holds for RC6-NFR, then the
following two conditions on the subkeys must hold:

    $f(-S[9]) =_{32} -S[7]$,
    $f(S[8]) =_{32} -S[11]$.

Proof. First we observe that if the characteristic is to hold, then certain rotation
amounts derived from the B and D registers must be zero. Note that we always
have that $B_i = A_{i+1}$ and that $D_i = C_{i+1}$. As a consequence, for the characteristic
to hold we must have

    $D_2 =_{32} C_3 =_{32} 0$,    $B_3 =_{32} A_4 =_{32} 0$,
    $B_4 =_{32} A_5 =_{32} 0$,    $D_4 =_{32} C_5 =_{32} 0$,
    $B_5 =_{32} A_6 =_{32} 0$,    $B_6 =_{32} A_7 =_{32} 0$.

Using the fact that the rotation amounts are 0, we get the following two
equations from rounds three and four and rounds four and five.

    $B_4 = (C_3 \oplus f(D_3)) + S[7]$,    (1)
    $B_5 = (C_4 \oplus f(D_4)) + S[9]$.    (2)

Since $B_4 =_{32} 0$, $C_3 =_{32} 0$, $B_5 =_{32} 0$ and $D_4 =_{32} 0$, we have $S[7] =_{32} -f(D_3)$
and $C_4 =_{32} -S[9]$. Since $C_4 = D_3$, we obtain the first condition on subkeys
$S[7] =_{32} -f(-S[9])$.

Similarly, looking at the computation from rounds four and five and rounds
five and six, we get the following two equations.

    $D_5 = (A_4 \oplus f(B_4)) + S[8]$,     (3)
    $B_6 = (C_5 \oplus f(D_5)) + S[11]$.    (4)

Since $A_4 =_{32} 0$, $B_4 =_{32} 0$, $B_6 =_{32} 0$ and $C_5 =_{32} 0$, we have $D_5 =_{32} S[8]$ and
$S[11] =_{32} -f(D_5)$, and so $S[11] =_{32} -f(S[8])$. □

The subkey dependencies in Lemma 1 were obtained using only four equations
(those for $B_4$, $B_5$, $D_5$ and $B_6$). In total, one could write down 12 equations
of the form $B_{i+1} = ((C_i \oplus f(D_i)) \lll f(B_i)) + S[2i + 1]$ and
$D_{i+1} = ((A_i \oplus f(B_i)) \lll f(D_i)) + S[2i]$ for this characteristic. Although there might
be dependencies involving other equations, the four given above will be the
focus of the rest of this section. Essentially, each equation involves four variables
and the aim is to combine equations to obtain two expressions with a single
variable. If the two expressions involve the same variable then we can obtain
conditions on the subkeys involved. The four equations we use are the only ones
from the set of twelve that allow us to do this.

It is worth noting that given such conditions on the subkeys involved not 
only does the characteristic hold, but it does so with a higher probability than 
the expected value given in [3].

Lemma 2. Assume that the characteristic given in Table 1 holds up to round
five. Furthermore suppose that $f(-S[9]) =_{32} -S[7]$ and $f(S[8]) =_{32} -S[11]$.
Then $B_5 =_{32} 0$ and $B_6 =_{32} 0$.

Proof. From Lemma 1 we have that $S[7] =_{32} -f(D_3)$. This is equivalent to
$-S[7] =_{32} f(C_4)$. Also, we have that $B_5 =_{32} C_4 + S[9]$. So, if $-S[7] =_{32} f(-S[9])$
then $f(C_4) =_{32} f(-S[9])$, which implies that $C_4 =_{32} -S[9]$ (since f is a bijection
modulo 32) and so $B_5 =_{32} 0$. A similar argument can be used to show that
$B_6 =_{32} 0$. □

Lemma 2 shows that when the subkey conditions hold, $B_5 =_{32} 0$ and $B_6 =_{32} 0$.
In this case the probability of the characteristic will be
$2^{-5} \times 2^{-5} \times 2^{-5} \times 2^{-5} = 2^{-20}$,
since two of the rotation amounts are always zero. Recall that the estimated
probability for the characteristic when averaged over all keys is $2^{-30}$ [3]. Here we
have shown (Lemmas 1 and 2) that there is some irregularity in the distribution
of the probability: For a fraction of $2^{-10}$ keys the probability is $2^{-20}$, and for the
rest of the keys the probability is much smaller than $2^{-30}$. This kind of irregular
distribution can sometimes be exploited as was demonstrated by Knudsen and
Meier with RC5 [8], who showed some techniques for using it in a differential
attack. We would expect the same to apply here. Similar subkey dependencies
can be observed for some of the other characteristics for RC6-I-NFR and RC6-NFR
given in [3]. However in some cases the characteristic must be iterated more
than once before dependencies exist.
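As a rough consistency check on these figures, the counting can be written out as below. It rests on heuristic assumptions that are not stated in this form in the text: each condition modulo 32 is treated as holding independently with probability $2^{-5}$, and the subkeys are uniformly random.

```latex
\[
\underbrace{(2^{-5})^{6}}_{\text{six rotation conditions}} = 2^{-30},
\qquad
\underbrace{2^{-5}\cdot 2^{-5}}_{\text{two subkey conditions}} = 2^{-10},
\qquad
\underbrace{(2^{-5})^{4}}_{\text{four remaining rotation conditions}} = 2^{-20},
\]
\[
2^{-10}\cdot 2^{-20} + \left(1 - 2^{-10}\right)\cdot 0 = 2^{-30}.
\]
```

Under these assumptions the keys that satisfy the subkey conditions account for the whole of the average probability, which agrees with Lemma 1: for every other key the characteristic cannot hold.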

Note that the behavior of the differential associated with some characteristic
is typically of more importance in a differential attack. For RC6-I-NFR, while the
characteristic displays the irregular behavior already described, the associated
differential has been experimentally verified to hold with the expected
probability [3]. However the associated differential for RC6-NFR appears to have the
same irregular behavior as the characteristic. Why is there this discrepancy? In
[3] it is shown how the introduction of the quadratic function helps to reduce the
additional effect of differentials. In short, for RC6-I-NFR there are many equally
viable paths that match the beginning and end-points of the characteristic. If the
characteristic fails to hold because of some choice of subkey values, other
characteristics hold instead, thereby maintaining the probability of the differential.
However, with RC6-NFR we introduce the quadratic function and this typically
reduces differentials to being dominated by the action of a single characteristic.
Irregular behavior in the characteristic will therefore manifest itself as irregular
behavior in the differential.

3.3 Differential Characteristics in RC6-I and RC6 

Let us now consider the role of the fixed rotation that was omitted in RC6-I- 
NFR and RC6-NFR. We will find that this single operation removes the kind of 
subkey dependencies that occurred in these two variants. 

We will focus on RC6-I in the analysis for simplicity, and the same arguments
also apply to the full RC6. We will need to make some heuristic assumptions to
make headway with our analysis. Nevertheless our experimental results confirm
that the differential behavior of RC6-I is pretty much as expected. It also closely
matches the behavior described in [3].

Consider the characteristic given in Table 2. This is the characteristic which
seemed to be one of the most useful for attacking RC6-I [3]. We first argue that
there are no subkey dependencies of the form we described in Section 3.2 for
this characteristic and we then broaden our discussion to include other, more
general, characteristics.



  i    $A_i$       $B_i$       $C_i$       $D_i$
  1    $e_{16}$    $e_{11}$    0           0
  2    $e_{11}$    0           0           0
  3    0           0           0           $e_{26}$
  4    0           $e_{26}$    $e_{26}$    0
  5    $e_{26}$    $e_{21}$    0           $e_{26}$
  6    $e_{21}$    $e_{16}$    $e_{26}$    0
  7    $e_{16}$    $e_{11}$    0           0

Table 2. A useful characteristic for RC6-I.



At this stage we need some new notation and the exponent n will be used to
denote when some quantity has been rotated to the left by n bit positions. For
example, $D_2^{5} =_{32} 15$ means that when $D_2$ is rotated five bits to the left, then the
decimal value of the least significant five bits is 15. Of course, this is the same
as saying that the most significant five bits of $D_2$ take the value 15.

For simplicity, we will assume that $(x + y)^{j} = x^{j} + y^{j}$ where j denotes a
rotation amount. This is true if, and only if, there is no carry-out when adding
the top j bits and no carry-out when adding the bottom $32 - j$ bits. For the
sake of our analysis however we make this assumption, since it should actually
facilitate the construction of any potential subkey dependencies!
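The "if, and only if" condition on rotation and addition can be checked empirically. The following sketch assumes w = 32 and rotation amounts 1 ≤ j ≤ 31; the helper names are hypothetical. It compares, on random inputs, whether rotation distributes over addition with whether both carry-outs are absent.

```python
import random

MASK = 0xFFFFFFFF  # w = 32

def rotl(x, j):
    """Rotate the 32-bit word x left by j positions."""
    j %= 32
    return ((x << j) | (x >> (32 - j))) & MASK

def rotation_distributes(x, y, j):
    """True when rotating the sum equals the sum of the rotated words."""
    return rotl((x + y) & MASK, j) == ((rotl(x, j) + rotl(y, j)) & MASK)

def no_carries(x, y, j):
    """True when adding the top j bits and the bottom 32 - j bits gives no carry-out."""
    low_bits = 32 - j
    low_mask = (1 << low_bits) - 1
    low_carry = ((x & low_mask) + (y & low_mask)) >> low_bits
    high_carry = ((x >> low_bits) + (y >> low_bits)) >> j
    return low_carry == 0 and high_carry == 0

def check(trials=20000, seed=1):
    """Compare the two predicates on random (x, y, j) triples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.getrandbits(32), rng.getrandbits(32)
        j = rng.randrange(1, 32)
        if rotation_distributes(x, y, j) != no_carries(x, y, j):
            return False
    return True

print(check())
```

For random words the two carries are frequently present, so the assumption is an idealisation that favours the attacker, as the text notes.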

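Following the arguments in Lemma 1, for the characteristic in Table 2 to
hold the following rotation amounts must take the values indicated:

    $D_2^{5} =_{32} C_3^{5} =_{32} 15$,    $B_3^{5} =_{32} A_4^{5} =_{32} 27$,
    $B_4^{5} =_{32} A_5^{5} =_{32} 27$,    $D_4^{5} =_{32} C_5^{5} =_{32} 27$,
    $B_5^{5} =_{32} A_6^{5} =_{32} 17$,    $B_6^{5} =_{32} A_7^{5} =_{32} 17$.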

We wish to write down four equations similar to Equations (1), (2), (3),
and (4) which cause subkey dependencies in RC6-NFR. From round three to four,
the difference $e_{26}$ is copied from register $D_3$, is changed to $e_{31}$ by the action of the
fixed rotation, and then exclusive-ored into the C strand. For it to become the
$e_{26}$ that appears in $B_4$, the data dependent rotation must have the value 27.
Hence, we must have $B_3^{5} =_{32} 27$ and
$B_4 = (C_3 \oplus D_3^{5})^{27} + S[7] = C_3^{27} \oplus D_3 + S[7]$.
In a similar way other equations can be derived:

    $B_4 = C_3^{27} \oplus D_3 + S[7]$,           (5)
    $B_5 = C_4^{27} \oplus D_4 + S[9]$,           (6)
    $D_5 = A_4^{27} \oplus B_4 + S[8]$,           (7)
    $B_6 = C_5^{17} \oplus D_5^{22} + S[11]$.     (8)

In Lemma 1 we observed a subkey dependency by combining the analogous
equations to (5) and (6), and another dependency from combining the analogous
equations to (7) and (8). In the case of RC6-I we can demonstrate that neither
approach now works.
approach now works. 

We first consider Equations (5) and (6). For Equation (6) we know that the
values of $B_4^{5} \bmod 32$, $B_5^{5} \bmod 32$, and $D_4^{5} \bmod 32$ are fixed. This implies a
condition on the least significant five bits of $C_4$. Since $C_4$ is the same as $D_3$,
we have a condition on $D_3 \bmod 32$. We now have conditions on all the registers
in Equation (5), namely, $B_4^{5} \bmod 32$, $C_3^{5} \bmod 32$, and $D_3 \bmod 32$. However the
bits from different words involved in this equation are from different positions.
They don't lead to any constraints on S[9], and there appear to be no subkey
dependencies as a result.

Similar arguments also apply to Equations (7) and (8). One may also try
to combine Equations (5) and (7), since they have the quantity $B_4$ in common,
or Equations (6) and (8), since they have $C_5 = D_4$ in common. However, these
combinations once again fail to give any subkey dependencies.

We performed experiments on RC6-I to assess the probability of the
characteristic given in Table 2. These results confirmed that the distribution of the
characteristic probability was as expected, and there was no indication of any
subkey dependencies for the characteristic.



  i    $A_i$          $B_i$          $C_i$      $D_i$
  1    $e_{t+5}$      $e_t$          0          0
  2    $e_t$          0              0          0
  3    0              0              0          $e_s$
  4    0              $e_u$          $e_s$      0
  5    $e_u$          $e_{u-5}$      0          $e_v$
  6    $e_{u-5}$      $e_{u-10}$     $e_v$      0
  7    $e_{u-10}$     $e_{u-15}$     0          0

Table 3. A generalized characteristic for RC6-I.
More generally, we might consider characteristics of the form given in Table 3.
The values which we need to fix if the characteristic is going to hold are

    $D_2^{5} =_{32} C_3^{5} =_{32} s - t$,         $B_3^{5} =_{32} A_4^{5} =_{32} u - 5 - s$,
    $B_4^{5} =_{32} A_5^{5} =_{32} u - 5 - s$,     $D_4^{5} =_{32} C_5^{5} =_{32} v - u - 5$,
    $B_5^{5} =_{32} A_6^{5} =_{32} u - 15 - v$,    $B_6^{5} =_{32} A_7^{5} =_{32} u - 15 - v$.

Let $r_1 = u - 5 - s$, $r_2 = v - u - 5$, and $r_3 = u - 15 - v$. Then the subkey
dependencies we observed would be produced by the following equations:

    $B_4 = C_3^{r_1} \oplus D_3^{r_1 + 5} + S[7]$,
    $B_5 = C_4^{r_1} \oplus D_4^{r_1 + 5} + S[9]$,
    $D_5 = A_4^{r_2} \oplus B_4^{r_2 + 5} + S[8]$,
    $B_6 = C_5^{r_3} \oplus D_5^{r_3 + 5} + S[11]$.

Following similar arguments to those presented earlier, it can be verified that
there is no choice for $r_1$, $r_2$, and $r_3$ that makes the characteristic depend upon
the values of the subkeys. In particular, the most promising values to try are
$r_1 = 0$; $r_1 = 27$; $r_3 = 0$ and $r_2 = 22$; and $r_3 = 0$, $r_2 = 27$, and $r_1 = 27$.

The fixed rotation is an important component of RC6. Not only does it help
to hinder the construction of good differentials and linear approximations [3],
but it helps to disturb the build-up of any inter-round dependencies. Here the
fixed rotation ensures that equations can simultaneously hold without forcing
any restriction on the values of the quantities involved.



4 The Quadratic Function 

In this section, we examine the diffusive properties of the quadratic function
and other operations used in RC6. Both the work of Knudsen and Meier [8] and
that of Biryukov and Kushilevitz [2] rely on the following fact about RC5: It
has a relatively slow avalanche of change from one round to the next, unless
the difference in two words is in the bits used to determine a data-dependent
rotation. When that happens, the amount of change from one round to the next
can be dramatic, but until then the rate of change tends to be rather modest.
This can be exploited to a limited degree in attacks on RC5 [2,8].

We will choose a measure of diffusion that complements naturally the work
given in [2,8]. We will use the Hamming weight of the exclusive-or difference
between two words as a measure of the difference, rather than the actual value of
the difference as we would in differential cryptanalysis [1] or part of the difference
as we would in truncated differential cryptanalysis [7]. It is straightforward to
envisage using this notion of difference in a differential-style attack, something we
call Hamming weight differentials, and this is very similar to some of the earlier
analysis of RC5 [2,8]. While this earlier work focused on how to effectively use
such differentials to attack RC5, the focus of our work will be on assessing the
likely impact of the quadratic function in thwarting such attacks.
Even for a simple operation it can be difficult to fully characterize the
probability distribution of the Hamming weight of some output difference given the
Hamming weight of the input differences. We will study the problem by
analyzing the expected Hamming weight of such an output difference and it turns
out that such an approach provides a good insight into the role of the different
operations.

Our analysis shows that the quadratic function drastically increases the Ham- 
ming weight of some difference especially when the Hamming weight of the input 
difference is small. This illustrates a nice effect whereby the use of the quadratic 
function complements that of the data-dependent rotation. As we have men- 
tioned, the data-dependent rotation becomes an effective agent of change only 
when there is a difference in the rotation amount. With a small Hamming weight 
difference, it is less likely that non-zero difference bits appear in positions that 
affect a rotation amount. However, the quadratic function helps to drastically 
increase the avalanche of change so that the full benefit of the data-dependent 
rotations can be gained as soon as possible. 

4.1 Definitions and Assumptions 

We introduce some useful notation and definitions. For a w-bit binary vector X,
let $|X|$ denote the Hamming weight of X, i.e., $|X|$ is the number of 1's in X.
Throughout this paper we will be continually referring to RC6 and so we will
assume that the word size w = 32. We will let $X' = X_1 \oplus X_2$, $Y' = Y_1 \oplus Y_2$, and
$Z' = Z_1 \oplus Z_2$ and we use x, y, z to denote the Hamming weight of the differences
$|X'|$, $|Y'|$, $|Z'|$, respectively.

Let us consider the following two conditions that may be imposed on some
difference that has Hamming weight x.

A: There is a single block of consecutive 1's of length x, and the block is
distributed randomly at some position in the input difference.

B: There are $t \geq 1$ blocks of consecutive 1's of length $x_1, x_2, \ldots, x_t$ such that
$x_1 + x_2 + \cdots + x_t = x$. In addition, each block is distributed randomly across
the input difference.

Condition B is actually a good characterization for the differences in the
intermediate rounds of RC6 and its variants. In each round (of RC6 or its variants)
any difference in the A and C strands is rotated by a random amount due to
the data-dependent rotations. Hence each block of 1's within the differences is
distributed randomly. Condition A is a special case of Condition B. In the next
two sections when we examine the diffusive properties of individual operations,
we will first consider the special case Condition A and then generalize the results
to Condition B.

4.2 Diffusive Properties of the Basic Operations 

Here we analyze the basic operations of exclusive-or, addition, and rotation. The 
more complicated quadratic function will be considered in the next section. 
Lemma 3. (exclusive-or) For i = 1,2 let $Z_i = X_i \oplus Y_i$. If $X'$ and $Y'$ satisfy
Condition A, then $E(z) = x + y - \frac{2xy}{w}$.

Proof. Since the block of 1's in $X'$ and $Y'$ is distributed randomly, each bit “1” in
$X'$ overlaps with each bit “1” in $Y'$ with probability $\frac{1}{w}$. So the expected length
of overlap in the output difference is $\frac{xy}{w}$, implying that the expected Hamming
weight of the output is $x + y - \frac{2xy}{w}$. □

Corollary 4. (exclusive-or) For i = 1,2 let $Z_i = X_i \oplus Y_i$. If $X'$ and $Y'$ satisfy
Condition B then $E(z) = x + y - \frac{2xy}{w}$.

Proof. Follows directly from the proof of Lemma 3. □

Note that the expected overlap between the quantities $X'$ and $Y'$ is similar to
the number of “corrections” used by Biryukov and Kushilevitz in their analysis of
corrected Fibonacci sequences [2]. There an explicit formula was not provided [2]
but all sequences with a “reasonable” number of corrections were experimentally
generated and this was used as an estimate in their work.
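The expected weight of an exclusive-or of two single-block differences can be computed exactly by enumeration. The sketch below makes one assumption that the text leaves open: "distributed randomly" is modelled as a uniformly random cyclic start position, so that blocks may wrap around the word. Under that model each pair of 1 bits coincides with probability 1/w, the expected overlap is xy/w, and the expected weight is x + y - 2xy/w. The function names are hypothetical.

```python
from fractions import Fraction

W = 32  # word size

def block(length, start, w=W):
    """w-bit word with `length` consecutive 1's from bit `start`, wrapping cyclically."""
    word = 0
    for k in range(length):
        word |= 1 << ((start + k) % w)
    return word

def expected_xor_weight(x, y, w=W):
    """Exact average of |X' xor Y'| over all cyclic positions of both blocks."""
    total = 0
    for p in range(w):
        xb = block(x, p, w)
        for q in range(w):
            total += bin(xb ^ block(y, q, w)).count("1")
    return Fraction(total, w * w)

def xor_weight_formula(x, y, w=W):
    """x + y - 2xy/w, the closed form implied by an expected overlap of xy/w."""
    return x + y - Fraction(2 * x * y, w)

for x, y in [(1, 1), (3, 7), (16, 20)]:
    print(x, y, expected_xor_weight(x, y), xor_weight_formula(x, y))
```

With non-cyclic placement the overlap probability is no longer exactly 1/w for every pair of bits, so the closed form should then be read as an approximation.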

Lemma 5. (addition) For i = 1,2 let $Z_i = X_i + S$, where S is the subkey. If $X'$
satisfies Condition A then averaging over all possible $X_1, X_2, S$,
$E(z) = c + \frac{x+1}{2}$ where $c \in [0, 1]$ and depends on $X'$.

Proof. We start with the special case where $|X'| = w$, that is, $X_1$ and $X_2$ differ
in all bits. We first prove that when averaging over all possible $X_1, X_2, S$,

    $\mathrm{prob}(X_1 + S < 2^w \text{ and } X_2 + S \geq 2^w) = \frac{1}{4}$.    (9)

Given any $X_1 \in \{0, 1\}^w$, we define

    $d(X_1) = |\{S : S \in \{0, 1\}^w \text{ s.t. } X_1 + S < 2^w \text{ and } X_2 + S \geq 2^w\}|$.

If $X_1 < 2^{w-1}$, we have $d(X_1) = X_2 - X_1 = (X_1 \oplus (2^w - 1)) - X_1 = 2^w - 1 - 2X_1$.
(If $X_1 \geq 2^{w-1}$, $d(X_1) = 0$.) Since $\sum_{X_1 = 0}^{2^{w-1} - 1} (2^w - 1 - 2X_1) = 2^{2w-2}$, we have

    $\mathrm{prob}(X_1 + S < 2^w \text{ and } X_2 + S \geq 2^w) = \frac{\sum_{X_1} d(X_1)}{2^w \times 2^w} = \frac{1}{4}$.

Note that for Equation (9) the particular value of w is unimportant. So we can
consider the least significant j bits of $X_1, X_2, S$. More precisely, for $1 \leq j \leq w$,
define $X_1(j) = X_1 \bmod 2^j$, $X_2(j) = X_2 \bmod 2^j$, $S(j) = S \bmod 2^j$. Then,

    $\mathrm{prob}(X_1(j) + S(j) < 2^j \text{ and } X_2(j) + S(j) \geq 2^j) = \frac{1}{4}$.    (10)

By symmetry,

    $\mathrm{prob}(X_1(j) + S(j) \geq 2^j \text{ and } X_2(j) + S(j) < 2^j) = \frac{1}{4}$.    (11)
From Equations (10) and (11) we know that with probability 1/2, exactly one of
the two addition operations ($X_1 + S$ and $X_2 + S$) produces a carry into bit j. If
this happens, $Z_1$ and $Z_2$ will be the same in bit j. Therefore, with probability
1/2, the bit j ($j \geq 1$) of $Z' = Z_1 \oplus Z_2$ is 1. Since bit 0 of $Z'$ is always 1, the
expected Hamming weight of $Z'$ is $\frac{w-1}{2} + 1 = \frac{w+1}{2} = c + \frac{x+1}{2}$ for c = 0. We have
proved the Lemma for the special case where $|X'| = w$.

Let us now consider the general case where $|X'| = x$ for some $1 \leq x < w$.
Let v be the index of the most significant 1 in $X'$. So $X_1$ and $X_2$ are the same
in bits v + 1 through w - 1. When computing $Z_1 = X_1 + S$ and $Z_2 = X_2 + S$,
it is possible that one or both of the carries will propagate into bits v + 1 and
higher. It is not hard to show that the “extra” number of bit differences between
$Z_1$ and $Z_2$ due to this carry effect has an expectation c for some $0 \leq c \leq 1$. So
the expected Hamming weight of the output difference is $c + \frac{x+1}{2}$. □
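The special case of the addition lemma in which the two inputs differ in every bit can be confirmed by brute force for small word sizes. The sketch below is illustrative only; the function name is hypothetical, and it uses small w because the enumeration takes 2^(2w) steps.

```python
from fractions import Fraction

def expected_weight_all_bits_differ(w):
    """Exact average of |(X1 + S) xor (X2 + S)| over all X1, S, with X2 the complement of X1."""
    mask = (1 << w) - 1
    total = 0
    for x1 in range(1 << w):
        x2 = x1 ^ mask                      # X1 and X2 differ in all w bits
        for s in range(1 << w):
            z = ((x1 + s) & mask) ^ ((x2 + s) & mask)
            total += bin(z).count("1")
    return Fraction(total, 1 << (2 * w))

for w in (1, 2, 4, 8):
    print(w, expected_weight_all_bits_differ(w))
```

For each of these word sizes the average equals (w + 1)/2, matching the argument that bit 0 of the output difference is always set and every higher bit is set with probability 1/2.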

Corollary 6. (addition) For i = 1,2 let $Z_i = X_i + S$, where S is the subkey.
Suppose that $X'$ satisfies Condition B and there are t blocks of 1's in $X'$. Then
averaging over all possible keys S, $E(z) \leq t + \frac{x + t}{2}$.

Proof. Follows from Lemma 5. □

The fixed rotation $Z = X \lll \lg w$ always preserves the Hamming weight of
the input difference in the output difference. For data-dependent rotations, it
is straightforward to see that provided the input difference does not affect the
rotation amount, then the Hamming weight of the difference is preserved. We
can state this simple fact in the following lemma.

Lemma 7. (data-dependent rotation) For i = 1,2 let $Z_i = X_i \lll Y_i$. If
$Y' =_{w} 0$, then z = x.

The more interesting case is when $Y' \neq_{w} 0$. It has previously been shown ^
El that once a difference in the amount of rotation is experienced then the output 
difference is distributed in an essentially random manner over a very large set. 
This essentially makes any differential-style attack impossible since in this case 
there is a very substantial diffusive effect. So depending on the difference Y', a 
data-dependent rotation can either preserve the Hamming weight or increase the 
Hamming weight by a significant amount. The probability of the latter case oc- 
curring is closely related to the Hamming weight of Y' and we have the following 
lemma that characterizes such a relation for the special case. 

Lemma 8. Let $y = |Y'|$ and let p be the probability that $Y' \neq_{w} 0$. If $Y'$ satisfies
Condition A then $p = \min\left(\frac{y + \lg w - 1}{w}, 1\right)$.

For the more general case when Y' satisfies Condition B it is not so simple to 
derive a precise formula similar to the one given above. However it is clearly the 
case that the heavier the Hamming weight of Y' , the larger the probability that 
some part of the non-zero input difference will have an effect on the rotation 
amount. 
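For the single-block case, the probability that a difference touches the rotation amount can be obtained by direct counting. The sketch below assumes w = 32 and, as a modelling choice not spelled out in the text, that the block of 1's starts at a uniformly random cyclic position. Under that assumption exactly y + lg w - 1 of the w start positions place at least one 1 in the low lg w bits, capped at w. The function names are hypothetical.

```python
from fractions import Fraction

def prob_rotation_affected(y, w=32, lgw=5):
    """Fraction of cyclic placements of a y-bit block of 1's that hit the low lg w bits."""
    low_mask = (1 << lgw) - 1
    hits = 0
    for start in range(w):
        word = 0
        for k in range(y):
            word |= 1 << ((start + k) % w)
        if word & low_mask:
            hits += 1
    return Fraction(hits, w)

def counting_formula(y, w=32, lgw=5):
    """min((y + lg w - 1) / w, 1), the value obtained by counting start positions."""
    return min(Fraction(y + lgw - 1, w), Fraction(1))

for y in (1, 8, 16, 28):
    print(y, prob_rotation_affected(y), counting_formula(y))
```

A difference of weight 1 therefore affects the rotation amount with probability 5/32, while any single block of 28 or more bits always does.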
4.3 Diffusive Properties of the Quadratic Function

Here we consider the diffusive properties of the quadratic function $Z = f(X)$, an
important new operation in RC6. First, we restate a lemma regarding the
quadratic function that first appeared in [3]. This lemma characterizes the behavior
of the output when a single bit of some input is flipped.

Lemma 9. [3] Given an input $X_1$ chosen uniformly at random from $\{0, 1\}^{32}$,
let $g_{i,j}$ denote the probability that flipping bit i of $X_1$ will flip bit j of $Z_1 = f(X_1)$.
Then,

    $g_{i,j} = 0$ for $j < i$,
    $g_{i,j} = 1$ for $j = i$,
    $g_{i,j} = 1$ for $j = 1$ and $i = 0$, and
    $g_{i,j} \in [1/4, 3/4]$ for $j > i \geq 1$ or $j \geq 2$ and $i = 0$.

For the last case, $g_{i,j}$ is close to 3/4 if $j = 2i + 2$, and for most of the other i,j
pairs $g_{i,j}$ is close to 1/2.

Put descriptively this lemma shows that flipping bit i of some input X will 
always flip bit i of the output and will, in most cases, also flip bit j where j > i 
of the output with probability around 1/2. 
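The behavior described by Lemma 9 can be observed with a small sampling experiment. The sketch below assumes that the quadratic function is f(x) = x(2x + 1) mod 2^w with w = 32, as in the RC6 specification; the function names, the sample size and the seed are illustrative choices and not part of the original analysis.

```python
import random

W = 32
MASK = (1 << W) - 1

def f(x):
    """Quadratic function f(x) = x(2x + 1) mod 2^w (assumed, as in the RC6 specification)."""
    return (x * (2 * x + 1)) & MASK

def flip_probabilities(i, samples=2000, seed=1):
    """Estimate g[i][j] for a fixed input bit i and every output bit j by sampling."""
    rng = random.Random(seed)
    counts = [0] * W
    for _ in range(samples):
        x = rng.getrandbits(W)
        diff = f(x) ^ f(x ^ (1 << i))   # output bits that flipped
        for j in range(W):
            counts[j] += (diff >> j) & 1
    return [c / samples for c in counts]

if __name__ == "__main__":
    for i in (0, 5):
        print(i, [round(p, 2) for p in flip_probabilities(i)])
```

Output bits below position i never flip and bit i always flips, because f(x ± 2^i) − f(x) is 2^i times an odd number; the remaining estimates are sample-dependent and only approximate the probabilities in the lemma.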

We can extend the lemma to the more general case where multiple bits of
the input are flipped and we obtain a similar result: Let i be the bit position
of the least significant 1 in X'. Then flipping bit i of the input $X_1$ will always
flip bit i of the output and will, in most cases, flip bit j for j > i of the output
with probability around 1/2. Experiments confirm both this intuition and also
the following, perhaps surprising, result.

Lemma 10. (quadratic function) For i = 1, 2 let $Z_i = f(X_i)$. Let x = |X'|
and z = |Z'|. If X' satisfies Condition A then $E(z) \approx 1 + \frac{x + w - 2}{4}$.

Proof. Let i be the index of the least significant 1 in X'. For a fixed i, the
expected value of z is roughly 1 + (w - 1 - i)/2. If X' satisfies Condition A then
i is uniformly distributed between 0 and (w - x). Hence,

\[
E(z) \approx \frac{1}{(w - x) + 1} \sum_{i=0}^{w-x} \left( 1 + \frac{w - 1 - i}{2} \right)
     = 1 + \frac{x + w - 2}{4}.
\]

□

Corollary 11. (quadratic function) For i = 1, 2 let $Z_i = f(X_i)$. Let x = |X'|
and z = |Z'|. If X' satisfies Condition B and there are t blocks of 1's in X',
then $E(z) \geq 1 + \frac{x + w + t - 3}{4}$.

Proof. Similar to the proof of Lemma 10. □

Lemma 10 shows that even when the difference in some input to the quadratic
function has Hamming weight 1, the average Hamming weight of the difference in
the output is 8.75 (for w = 32). This is a very important result. All the other basic
operations in RC6, as well as those used in RC5, generally provide little or no
additional change to the output difference if the Hamming weight of the input
difference is very low.
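The figure 8.75 follows from the averaging step in the proof of Lemma 10 with w = 32 and x = 1. The short check below evaluates that average exactly and compares it with the closed form 1 + (x + w − 2)/4; the function names are illustrative.

```python
from fractions import Fraction

def expected_output_weight(w, x):
    """Average of 1 + (w - 1 - i)/2 over i = 0, ..., w - x (proof of Lemma 10)."""
    terms = [1 + Fraction(w - 1 - i, 2) for i in range(w - x + 1)]
    return sum(terms) / len(terms)

def closed_form(w, x):
    """Closed form stated in Lemma 10."""
    return 1 + Fraction(x + w - 2, 4)

if __name__ == "__main__":
    print(float(expected_output_weight(32, 1)))  # single-bit difference, w = 32
```

Exact rational arithmetic is used so that the two expressions can be compared without rounding.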

We can illustrate the effect of including the quadratic function in the following
way. We experimentally measure the probability that the rotation amounts^ at
the end of a given number of rounds are unaffected by a single bit change in the
first word of the input to the cipher. We consider rotation amounts in this exercise
because current differential-style attacks on RC5 and RC6 require any difference
propagating through the cipher to leave the rotation amounts unchanged. We
use "-" to indicate that experimentally the probability is indistinguishable
from random noise.



Rounds    RC6-I-NFR    RC6-I       RC6-NFR     RC6
2         2^-0.54      2^-0.64                 2^-10.27
4         2^-2.15      2^-2.45     2^-6.27     -
6         2^-6.14      2^-7.04     2^-14.30    -
8         2^-12.76     2^-14.97    -           -
10        2^-19.07     -           -           -



For an increased number of rounds, the probability of unchanged rotation 
amounts gives a good illustration of the relative diffusive effect of RC6 and its 
weakened variants. It also illustrates the role of the quadratic function in the 
security of RC6. 

Basic differential-style attacks attempt to predict and control the change
from one round to the next during encryption. Improved attacks on RC5
do not attempt to predict the difference quite so closely. Instead, they rely on
the relatively slow diffusive effect of RC5 to ensure that any change propagating
through the cipher remains manageable and to some extent predictable. Even
though single-bit starting differences might be used, differentials with an ending
difference of Hamming weight 15, for example, can still be useful.

The quadratic function was added to RC6 to address this particular short- 
coming of RC5 and our work suggests that the quadratic function is likely to 
hinder attacks that rely on a modest avalanche of change from one round to the 
next. 

5 Conclusions 

In this paper we have considered the role of two operations in RC6 that
differentiate it from RC5. Both operations are essential to the security of RC6. It
is interesting to observe that RC6-I-NFR, a simplified variant of RC6 without
either of these operations, has some of the behavior of RC5. RC6-I-NFR tends
to have a slow rate of diffusion thereby potentially providing opportunities to
mount differential attacks similar to those described for RC5. Further,
RC6-I-NFR demonstrates some of the differentially-weak key phenomena that have also
been observed in RC5. The introduction of both the fixed rotation and the
quadratic function makes RC6 resistant to such shortcomings.

^ By "rotation amounts" we mean the low five bits of the registers for RC6-I-NFR
and RC6-NFR, the high five bits of the registers for RC6-I, and the high five bits of
the output of f(x) for RC6.

We stress the importance of simplicity when designing a cipher. Unnecessary 
complexity makes it hard to perform a systematic examination of the true se- 
curity offered. By contrast, the exceptional simplicity of RC5 invites others to 
assess its security. This tradition continues with RC6 with a design that encou- 
rages the researcher and aims to facilitate a deep understanding of the cipher. 
Abstract. In this paper we evaluate the resistance of the block cipher
RC5 against linear cryptanalysis. We describe a known plaintext attack 
that can break RC5-32 (blocksize 64) with 10 rounds and RC5-64 (block- 
size 128) with 15 rounds. In order to do this we use techniques related 
to the use of multiple linear approximations. Furthermore the success of 
the attack is largely based on the linear hull-effect. To our knowledge, at 
this moment these are the best known plaintext attacks on RC5, which 
have negligible storage requirements and do not make any assumption 
on the plaintext distribution. Furthermore we discuss the impact of our 
attacking method on the AES-candidate RC6, whose design was based 
on RC5. 



1 Introduction 

The iterated block cipher RC5 was introduced by Rivest in [Riv95]. It has a
variable number of rounds denoted with r and key size of b bytes. The design
is word-oriented for word sizes w = 32, 64 and the block size is 2w. The choice
of parameters is usually denoted by RC5-w, RC5-w/r, or RC5-w/r/b. Currently
RC5-32/16 is recommended to give sufficient resistance against linear and
differential attacks [KY98].

RC5 has been analyzed intensively. For an overview we refer to the report
by Kaliski and Yin [KY98]. Currently the best published attack can be found
in [BK98]. There a chosen plaintext attack is described for which we summarize
the complexities for different round versions of RC5^1 in the second column of
Table 1. As this is a differential attack, it yields a known plaintext attack for
a larger amount of known plaintexts. We give the estimated required
amount of known plaintexts in the third column of Table 1. The known plaintext
attack however needs a storage capacity for all the required plaintexts, i.e., the
attack cannot be mounted in a way that the attacker obtains and analyzes
the plaintexts one by one. We give the estimated required storage capacity of
the known plaintext version in the fourth column. For example, to mount this
attack for 4 rounds one would need to store 2^36 plaintexts with corresponding
ciphertexts, which is about 1 GByte. In this paper we present an attack that

^1 Although the attack of [BK98] makes use of very sophisticated techniques, according
to the authors the required amount of chosen plaintexts for the attack on 12 rounds 
might be reduced to 2®®. 
requires a negligible storage capacity. We give the required amount of plaintexts
in the fifth column of Table 1.



Table 1. Complexities (lg) of the attacks on RC5-32.

          Biryukov/Kushilevitz                      Our attack
Rounds    Chosen        Known         Storage       Known         Storage
          plaintexts    plaintexts                  plaintexts
4         7             36            36            28            negligible
6         16            40.5          40.5          40            negligible
8         28            46.5          46.5          52            negligible
10        36            50.5          50.5          64            negligible
12        44            54.5          54.5          -             -



Our attack is a linear attack, whose high success rate is based on a large 
linear hull-effect [Nyb94]. To our knowledge it is the first time that this effect
has significant consequences in the evaluation of the resistance of a cipher against 
linear attacks. Furthermore we use techniques closely related to multiple linear 
approximations to set up a practical attack. 

Recently RC6 [RRSY98] has been submitted to NIST for the AES-Development
Effort as a candidate to replace the DES as block cipher standard. The design of 
RC6 is based on RC5 and its public security analysis. Special adjustments were 
done to make RC6 resistant against the successful differential attacks on RC5. 
In this paper we also address the consequences for RC6 of our linear attack on 
RC5. 

The remainder of this paper is organized as follows. In Sect. 2 we describe
some techniques from linear cryptanalysis and show their merits and limitations
when applied to RC5 in Sect. 3. In Sect. 4 we describe our attack on RC5 and we
give experimental results on RC5-32 and RC5-64. We discuss the consequences
for RC6 in Sect. 5 and conclude in Sect. 6.

2 Linear Cryptanalysis 

Linear cryptanalysis was introduced and developed by Matsui in [Mat93, Mat94].
Additional advanced techniques, which are relevant for this paper can be found
in [KR94, Nyb94]. A basic linear attack makes use of a linear approximation
between bits of the plaintext P, bits of the ciphertext C and bits of the
expanded key K. Such a linear approximation is a probabilistic relation that can
be denoted as

\[ \alpha \cdot P \oplus \beta \cdot K = \gamma \cdot C, \tag{1} \]

where $\alpha$, $\beta$ and $\gamma$ are binary vectors and $\chi \cdot X = \bigoplus_i \chi_i X_i$ for $X = (X_0, X_1, \ldots)$.
Instead of P or C one can use intermediate computational values that can be
computed from P or C under the assumption of a key value. If a linear
approximation holds with probability $p = \frac{1}{2} + \delta$ with $\delta \neq 0$, a linear attack can be
mounted which needs about $c|\delta|^{-2}$ plaintexts. The value of c depends on the
attacking algorithm that is used. Here $\delta$ is called the deviation and $|\delta| = \epsilon$ is
called the bias.

Since the key K is fixed, (1) can be transformed into (2) without changing
the bias.

\[ \alpha \cdot P = \gamma \cdot C. \tag{2} \]

We say that for certain values P and C, (2) behaves in the deviation direction
if $\alpha \cdot P \oplus \gamma \cdot C = b$ and $\alpha \cdot P \oplus \gamma \cdot C = b$ has a positive deviation.

A linear approximation for the whole cipher can be derived by 'chaining'
linear approximations between intermediate values. If the probabilities of these
approximations are independent, the value of the deviation of the derived
approximation can be computed with Matsui's Piling Up Lemma [Mat93]. This states
that the deviation $\delta$ of n chained approximations with deviations $\delta_i$ is given by

\[ \delta = 2^{n-1} \prod_{i=1}^{n} \delta_i. \]
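The Piling Up Lemma can be checked numerically for independent approximations. The sketch below compares the product formula with an exact computation of the probability that an even number of the chained approximations fail; the function names are illustrative.

```python
from itertools import product
from math import prod

def piling_up(deviations):
    """Deviation of n chained independent approximations: 2^(n-1) * prod(delta_i)."""
    return 2 ** (len(deviations) - 1) * prod(deviations)

def exact_chained_deviation(deviations):
    """The chained approximation holds when an even number of the parts fail."""
    p_hold = 0.0
    for outcome in product((0, 1), repeat=len(deviations)):  # 1 = this part fails
        if sum(outcome) % 2 == 0:
            p = 1.0
            for fails, d in zip(outcome, deviations):
                p *= (0.5 - d) if fails else (0.5 + d)
            p_hold += p
    return p_hold - 0.5

if __name__ == "__main__":
    ds = [0.25, -0.125, 0.0625]
    print(piling_up(ds), exact_chained_deviation(ds))
```

The agreement relies on the independence assumption; Sect. 3.2 shows that this assumption does not hold for chained approximations of RC5.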

3 RC5 and Linear Cryptanalysis 

RC5 is defined as follows. First 2r + 2 round keys $S_i \in \{0, 1\}^w$, $i = 0, \ldots, 2r + 1$,
are derived from the user key.^2 If $(L_0, R_0) \in \{0, 1\}^w \times \{0, 1\}^w$ is the plaintext,
then the ciphertext $(L_{2r+1}, R_{2r+1})$ is computed iteratively with:

\begin{align*}
L_1 &= L_0 + S_0 \tag{3} \\
R_1 &= R_0 + S_1 \tag{4} \\
U_i &= L_i \oplus R_i \tag{5} \\
V_i &= U_i \lll R_i \tag{6} \\
R_{i+1} &= V_i + S_{i+1} \tag{7} \\
L_{i+1} &= R_i \tag{8}
\end{align*}

for $i = 1, \ldots, 2r$. Here + denotes addition modulo $2^w$ and $x \lll y$ rotation of
w-bit word x to the left over y mod w places. The computation of $(L_{i+2}, R_{i+2})$
from $(L_i, R_i)$ with i odd is considered as one round of RC5. In Fig. 1 a graphical
representation of one round is given.
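Equations (3) to (8) translate directly into code. The sketch below is an illustration only: it takes the expanded round keys as given (the key schedule is omitted, so the keys used in any experiment with it are arbitrary), and it includes the usual two-register formulation of RC5 as a cross-check. The function names are not from the paper.

```python
def rotl(x, s, w):
    """Rotate the w-bit word x to the left over s mod w places."""
    s %= w
    return ((x << s) | (x >> (w - s))) & ((1 << w) - 1)

def rc5_encrypt(L0, R0, S, w=32):
    """Encrypt one block with round keys S[0..2r+1], following Eqs. (3)-(8)."""
    mask = (1 << w) - 1
    L = (L0 + S[0]) & mask               # (3)
    R = (R0 + S[1]) & mask               # (4)
    for i in range(1, len(S) - 1):       # i = 1, ..., 2r
        U = L ^ R                        # (5)
        V = rotl(U, R, w)                # (6)
        L, R = R, (V + S[i + 1]) & mask  # (8) and (7)
    return L, R

def rc5_encrypt_two_register(A, B, S, w=32):
    """The common A/B description of RC5, used here only as a cross-check."""
    mask = (1 << w) - 1
    A = (A + S[0]) & mask
    B = (B + S[1]) & mask
    for i in range(1, len(S) // 2):
        A = (rotl(A ^ B, B, w) + S[2 * i]) & mask
        B = (rotl(B ^ A, A, w) + S[2 * i + 1]) & mask
    return A, B
```

Writing the cipher in half-round steps keeps the indices identical to those used in the approximations of Sect. 3.1.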



Fig. 1. One round of RC5.

^2 As the key schedule of RC5 has no relevance for our analysis we refer to [Riv95] for
a description. However for our experiments we have used the key schedule.

3.1 Linear Approximations

We shall consider the following linear approximations for xor, data-dependent
rotation and addition. We only look at approximations that consider one bit of
each term of the equation. The binary vector that has a 1 on position i and is 0
everywhere else, will be denoted with $e_i$. Let $A = B \oplus C$. Then:

\[ e_i \cdot A = e_i \cdot B \oplus e_i \cdot C, \qquad \delta = 2^{-1} \tag{9} \]

for $i \in \{0, \ldots, w - 1\}$. Let $D = E \lll F$. Then:

\[ e_i \cdot D = e_j \cdot E \oplus e_k \cdot F \oplus e_k \cdot (i - j), \qquad \delta = 2^{-\lg w - 1} \tag{10} \]

for $i, j \in \{0, \ldots, w - 1\}$ and $k \in \{0, \ldots, \lg w - 1\}$. (Here we have abused the
"·"-notation slightly to denote the k-th bit in the binary representation of i - j.)
If one is only interested in the bias of (10), one can leave out the term $e_k \cdot (i - j)$
or even $e_k \cdot F$. We use (9) and (10) to pass the xor and
rotation in RC5 as follows. Let $j, k \in \{0, \ldots, \lg w - 1\}$. Then

\begin{align*}
e_j \cdot U_i &= e_j \cdot L_i \oplus e_j \cdot R_i, & \delta &= 2^{-1} \tag{11} \\
e_k \cdot V_i &= e_j \cdot U_i \oplus e_j \cdot R_i \oplus e_j \cdot (k - j), & \delta &= 2^{-\lg w - 1} \tag{12}
\end{align*}
Chaining these two yields:

\[ e_k \cdot V_i = e_j \cdot L_i \oplus e_j \cdot (k - j), \qquad \delta = 2^{-\lg w - 1} \tag{13} \]

Finally, let G = H + S, where S is fixed. Then:

\begin{align*}
e_0 \cdot G &= e_0 \cdot H \oplus e_0 \cdot S, & \delta &= 2^{-1} \tag{14} \\
e_i \cdot G &= e_i \cdot H \oplus e_i \cdot S, & \delta &= 2^{-1} - 2^{-i} S[i - 1] \tag{15}
\end{align*}

for $i \in \{1, \ldots, w - 1\}$. Here $S[x] = S \bmod 2^{x+1}$, hence the x + 1 LSB's of S. Hence,
depending on the key, the bias of (15) can vary between 0 and 1/2. On the average
it is 1/4.
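The deviations of the rotation and addition approximations can be verified exhaustively for a small word size. The sketch below uses w = 8 (an assumption made only to keep the enumeration small; the paper works with w = 32 and w = 64) and checks the value $2^{-\lg w - 1}$ of (10) and the key-dependent value $2^{-1} - 2^{-i}(S \bmod 2^i)$ of (14) and (15).

```python
W, LGW = 8, 3                      # scaled-down word size, for illustration only
MASK = (1 << W) - 1

def bit(x, i):
    return (x >> i) & 1

def rotl(x, s):
    s %= W
    return ((x << s) | (x >> (W - s))) & MASK

def rotation_deviation(i, j, k):
    """Exact deviation of (10) over all E and all rotation amounts F mod w."""
    hits = 0
    for E in range(1 << W):
        for F in range(W):
            lhs = bit(rotl(E, F), i)
            rhs = bit(E, j) ^ bit(F, k) ^ bit((i - j) % W, k)
            hits += (lhs == rhs)
    return hits / (W << W) - 0.5

def addition_deviation(i, S):
    """Exact deviation of e_i.G = e_i.H xor e_i.S for G = H + S mod 2^w."""
    hits = sum(bit((H + S) & MASK, i) == (bit(H, i) ^ bit(S, i))
               for H in range(1 << W))
    return hits / (1 << W) - 0.5

if __name__ == "__main__":
    print(rotation_deviation(3, 5, 1))            # 2^(-lg w - 1) for w = 8
    print([addition_deviation(4, S) for S in (0, 3, 8, 15)])
```

Averaging the absolute value of the addition deviation over all values of the low key bits gives 1/4, the average bias quoted above.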



3.2 Key Dependency and Piling-Up 



Because of the fact that (14) has a bigger bias than (15), 'traditionally'
approximations on the LSB have been considered to be most useful for a linear attack
(see [KY95, KY98]). Using Approximations (9), (10) and (14) one can derive the
following iterative approximation for one round of RC5.

\[ e_0 \cdot L_i \oplus e_0 \cdot S_{i+1} = e_0 \cdot L_{i+2} \quad (i \geq 1), \qquad \delta = 2^{-\lg w - 1} \tag{16} \]

This approximation can be chained to l rounds as follows.

\[ e_0 \cdot L_i \oplus \bigoplus_{j=0}^{l-1} e_0 \cdot S_{i+1+2j} = e_0 \cdot L_{i+2l} \quad (i \geq 1) \tag{17} \]

According to the Piling Up Lemma this approximation would have deviation
$\delta = 2^{l-1} \left(2^{-\lg w - 1}\right)^l = 2^{-l \lg w - 1}$ if the chained approximations would be
independent. This is however not the case.

To illustrate this consider (17) for l = 2. The deviation of this approximation
depends on the lg w least significant bits of $S_{i+2}$. This can be seen as follows.
The probability that (16), i.e., the approximation over the first round, holds
depends on the value of $R_i \bmod w$. If $R_i \bmod w = 0$ it always holds, otherwise
it holds with probability 1/2. Hence, the deviation of (16) is computed under the
assumption that every value of $R_i \bmod w$ is equally likely. If $R_i \bmod w = 0$ then
the lg w least significant bits of $L_{i+1}$ are known. Now consider the approximation
for each possible value of $R_{i+1}$ separately. If $R_{i+1} \bmod w \in \{0, \ldots, \lg w - 1, w - \lg w + 1, \ldots, w - 1\}$
then, depending on the value of $S_{i+2} \bmod w$, part of the lg w
least significant bits of $R_{i+2}$ can be computed. It turns out that the values of
$R_{i+2} \bmod w$ are not equally likely. Hence, the Piling Up Lemma cannot be applied.

To illustrate this effect we have computed the bias of the two round
approximation for RC5-32 for every value of $S_{i+2} \bmod w$. These results are given in
Table 2. It can be seen that the bias can vary significantly between $\approx 2^{-10}$
and $\approx 2^{-16}$. On the other hand the average value is $\frac{1}{w} \sum_x |\delta_x| \approx 2^{-11}$. Since

$\delta_x \neq 0$ for all x, the average amount of expected plaintexts needed based on
Table 2 is given by ^ ^ ~ 2^^. Both are in accordance with the Piling

Up Lemma. 

Note further that for two values of $S_{i+2} \bmod w$ the deviation is negative. For
those values this means that if one would mount a basic linear attack
(Algorithm 1 in [Mat93]) on four rounds using (17) to find $S_0 \oplus S_2 \oplus S_4$ based on the
Piling Up Lemma, most likely one would find $S_0 \oplus S_2 \oplus S_4 \oplus 1$, given enough
texts.

We can conclude that although the deviation can vary significantly, the Piling
Up Lemma gives a good estimate for the average bias for RC5. We will show
that this estimate can be used to compute the expected average success rate of
a linear attack that does not use the sign of the deviation, but only its absolute
value; the bias.
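The key dependency of the two round approximation can be explored by exhaustive enumeration on a scaled-down cipher. The sketch below is an illustration under stated assumptions: it uses w = 8 instead of w = 32, arbitrary round key values, and hypothetical function names. It computes the exact deviation of (17) with l = 2 over all inputs $(L_i, R_i)$, so it does not reproduce the numbers of Table 2.

```python
W = 8                              # scaled-down word size, for illustration only
MASK = (1 << W) - 1

def rotl(x, s):
    s %= W
    return ((x << s) | (x >> (W - s))) & MASK

def half_round(L, R, S):
    """One step of Eqs. (5)-(8): (L_i, R_i) -> (L_{i+1}, R_{i+1})."""
    return R, (rotl(L ^ R, R) + S) & MASK

def two_round_deviation(S1, S2, S3):
    """Exact deviation of e_0.L_i xor e_0.S_{i+1} xor e_0.S_{i+3} = e_0.L_{i+4}."""
    hits = 0
    for L in range(1 << W):
        for R in range(1 << W):
            l, r = half_round(L, R, S1)
            l, r = half_round(l, r, S2)
            l, r = half_round(l, r, S3)   # r is R_{i+3}, which equals L_{i+4}
            hits += ((L ^ S1 ^ S3 ^ r) & 1) == 0
    return hits / (1 << (2 * W)) - 0.5

def deviation_by_middle_key(S1=0x3A, S3=0xC7):
    """Deviation for every value of S_{i+2} mod w, other key bits fixed to zero."""
    return [two_round_deviation(S1, x, S3) for x in range(W)]

if __name__ == "__main__":
    for x, d in enumerate(deviation_by_middle_key()):
        print(x, d)
```

The value of $S_{i+3}$ has no influence on the result, because the LSB of the final addition is linear and cancels against the key term of the approximation.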



Table 2. The deviation $\delta_x$ of Approximation (17) with l = 2 for RC5-32, depending
on $x = S_{i+2} \bmod w$.

x     10^3 δ_x     lg |δ_x|      x     10^3 δ_x     lg |δ_x|
00    0.930786     -10.07        10    0.228882     -12.09
01    0.991821     -9.98         11    0.656128     -10.57
02    0.473022     -11.05        12    0.015259     -16
03    0.564575     -10.79        13    -0.015259    -16
04    0.595093     -10.71        14    0.137329     -12.83
05    0.686646     -10.51        15    0.534058     -10.87
06    0.473022     -11.05        16    0.503540     -10.96
07    0.503540     -10.96        17    0.717163     -10.45
08    0.595093     -10.71        18    0.625610     -10.64
09    0.564575     -10.79        19    0.778198     -10.33
0a    0.289917     -11.75        1a    0.747681     -10.39
0b    0.381470     -11.36        1b    0.839233     -10.22
0c    0.411987     -11.25        1c    0.381470     -11.36
0d    0.289917     -11.75        1d    0.381470     -11.36
0e    0.106812     -13.19        1e    0.076294     -13.68
0f    0.228882     -12.09        1f    -0.045776    -14.42



3.3 Key Dependency and Linear Hulls 

The concept (approximate) linear hull was introduced in [Nyb94]. We will use
the term linear hull as follows: a linear hull is the set of
all chains of linear equations over (a part of) the cipher that produce the same
linear equation. The existence of this effect for RC5 was noticed in earlier
work, where also some preliminary work in determining the linear hull effects
for RC6 (and some simplified versions) can be found.

The following linear hull for an approximation of two rounds (i till i + 3) of
RC5 can be noticed. Let $j \in \{0, \ldots, w - 1\}$. Using (9), (10), (14) and (15) for
j the following lg w approximations for two rounds can be derived.^3

\[ e_j \cdot L_i \oplus e_j \cdot (k - j) \oplus e_k \cdot S_{i+1} = e_k \cdot L_{i+2}, \qquad \delta = \delta_{j,k} \tag{18} \]

for $k = 0, \ldots, \lg w - 1$. Likewise for the next two rounds and l also lg w
approximations can be derived:

\[ e_k \cdot L_{i+2} \oplus e_k \cdot (l - k) \oplus e_l \cdot S_{i+3} = e_l \cdot L_{i+4}, \qquad \delta = \delta_{k,l} \tag{19} \]

for $k = 0, \ldots, \lg w - 1$. Hence, one can chain lg w pairs of (18) and (19) to obtain
the two round approximation

\[ e_j \cdot L_i \oplus e_k \cdot S_{i+1} \oplus e_l \cdot S_{i+3} \oplus e_j \cdot (k - j) \oplus e_k \cdot (l - k) = e_l \cdot L_{i+4}, \qquad \delta = \delta_{j,k,l} \tag{20} \]

Neglecting the key-dependency of chaining and using the Piling Up Lemma one
gets for this bias

\[
\delta_{j,k,l} =
\begin{cases}
2^{-2 \lg w - 1} & k = 0,\ l = 0 \\
2^{-2 \lg w - 1} \left(1 - 2^{-k+1} S_{i+1}[k - 1]\right) & k \neq 0,\ l = 0 \\
2^{-2 \lg w - 1} \left(1 - 2^{-l+1} S_{i+3}[l - 1]\right) & k = 0,\ l \neq 0 \\
2^{-2 \lg w - 1} \left(1 - 2^{-k+1} S_{i+1}[k - 1]\right) \left(1 - 2^{-l+1} S_{i+3}[l - 1]\right) & k \neq 0,\ l \neq 0
\end{cases}
\tag{21}
\]

We note two things about (20) and its deviation given in (21). Firstly it is clear
from (21) that the deviation is key dependent, i.e., dependent on the lg w - 1
LSB's of $S_{i+1}$ and $S_{i+3}$. But we will show that because of the linear hull effect,
this key dependency is negligible. Secondly, for each choice of the triple j, k, l
the term $c_{j,k,l} = e_k \cdot S_{i+1} \oplus e_l \cdot S_{i+3} \oplus e_j \cdot (k - j) \oplus e_k \cdot (l - k)$ is constant, either 0
or 1. This constant actually determines the sign of the deviation of the following
approximation, which can be derived from (20) by leaving out $c_{j,k,l}$.

\[ e_j \cdot L_i = e_l \cdot L_{i+4}, \qquad \delta = \delta_{j,l}. \tag{22} \]

For the deviation the following holds:

\[ \delta_{j,l} = \sum_{k \in V_{j,l}} \delta_{j,k,l} - \sum_{k \notin V_{j,l}} \delta_{j,k,l} \tag{23} \]

where $V_{j,l} = \{k \in \{0, \ldots, \lg w - 1\} \mid c_{j,k,l} = 0\}$.

One can extend approximation (22) to hold for r subsequent rounds. In this
way one gets the following approximation for r rounds.

\[ e_j \cdot L_i = e_l \cdot L_{i+2r}, \qquad \delta = \delta_{j,l}, \tag{24} \]

where $j, l \in \{0, \ldots, w - 1\}$. The deviation can be computed/approximated by
considering the parts of the different chains, for the k-th round in the chain given
by

\[ e_{j_k} \cdot L_{i+2k} = e_{l_k} \cdot L_{i+2k+2}, \tag{25} \]

where $k \in \{0, \ldots, r - 1\}$ and

    $j_k = j$ if k = 0,
    $l_k \in \{0, \ldots, \lg w - 1\}$ if $k \neq r - 1$,
    $l_k = l$ if k = r - 1,
    $j_k = l_{k-1}$ for $k \in \{1, \ldots, r - 1\}$.

^3 Note that in Equation (18) and others $\delta_{j,k}$ is not the Kronecker delta, but a variable
$\delta$ with indices j and k.



In this way it can be seen that Equation (24) is a linear hull that consists of
$(\lg w)^{r-1}$ different chains, namely for all of the choices of $j_k, l_k$. When we take
the average bias of 1/4 for the approximation of addition on non-LSB's, we get for
the bias of (24):

\[
|\delta_{j,l}| =
\begin{cases}
c(r) & \text{if } l = 0 \\
\frac{1}{2} c(r) & \text{if } l \neq 0
\end{cases}
\tag{26}
\]

where c(r) can be estimated as

\begin{align*}
c(r) &= \gamma \sum_{k=0}^{r-1} \binom{r-1}{k} (\lg w - 1)^k \, 2^{(-\lg w - 1) r + r - 1 - k} \\
     &= \gamma \, 2^{-r \lg w - 1} \left( 1 + \frac{\lg w - 1}{2} \right)^{r-1} \\
     &= \frac{\gamma}{2w} \left( \frac{\lg w + 1}{2w} \right)^{r-1} \tag{27}
\end{align*}

where $\gamma$ is a factor that accounts for the effect of different chains that cancel
each other's contribution. The increase factor $c^+(r)$ of the bias is expressed
as

\[ c(r + 1) = c^+(r) \cdot c(r), \quad \text{for } r = 2, \ldots \tag{28} \]

Hence, if a linear attack can be mounted on RC5 with r rounds using x known
plaintexts, then this attack will have the same success probability if adapted to
RC5 with (r + 1) rounds if $(c^+(r))^{-2} x$ known plaintexts are available. From (27)
and (28) follows $c^+(r) = \frac{\lg w + 1}{2w}$. Now, for RC5-32 we have $c^+(r) \approx 2^{-3.4}$ and for
RC5-64 we have $c^+(r) \approx 2^{-4.2}$. In Sect. 4.4 we will show that the attacks that
we have implemented for RC5-32 and RC5-64 approximately behave according
to the previously derived approximate expectations. Hence, we can neglect the
key dependency in (21): the bias given according to (26) is a sufficient practical
estimate.
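The estimate (27) and the resulting per-round factors can be evaluated numerically. The sketch below assumes a constant cancellation factor $\gamma$ (set to 1 here, which does not affect the ratio $c^+$) and uses illustrative function names; it checks that the sum and the closed form in (27) agree and prints $\lg c^+$ and the data factor $(c^+)^{-2}$ for w = 32 and w = 64.

```python
from math import comb, isclose, log2

def c_sum(r, w, gamma=1.0):
    """First line of (27): sum over the number k of non-LSB intermediate bits."""
    lgw = w.bit_length() - 1
    return gamma * sum(comb(r - 1, k) * (lgw - 1) ** k
                       * 2.0 ** ((-lgw - 1) * r + r - 1 - k)
                       for k in range(r))

def c_closed(r, w, gamma=1.0):
    """Last line of (27)."""
    lgw = w.bit_length() - 1
    return gamma / (2 * w) * ((lgw + 1) / (2 * w)) ** (r - 1)

def growth_factor(w):
    """c+(r) = c(r + 1) / c(r) = (lg w + 1) / (2w), independent of r."""
    lgw = w.bit_length() - 1
    return (lgw + 1) / (2 * w)

if __name__ == "__main__":
    for w in (32, 64):
        cp = growth_factor(w)
        print(w, round(log2(cp), 1), round(-2 * log2(cp), 1))
```

The second printed number is the exponent of the factor by which the number of known plaintexts is expected to grow per additional round.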



4 The Attack on RC5 

In this section we derive a linear attack on RC5 and present an overview of 
the experimental results. In Sect. 4.1 we give the linear approximation and the






J. Borst, B. Preneel, and J. Vandewalle 



method that is used to guess key bits. In Sect. 4.2 we specify this method. In
Sect. 4.3 we describe the search algorithm. Finally, in Sect. 4.4 experimental
results of the implemented attack are given. 

4.1 The Linear Approximations 

We have derived and implemented a linear attack that uses approximation (24).
Since this approximation does not involve any key bits, a basic linear attack is
not possible. In particular for an r-round attack we will use with i = 0 and 
k = r. The first addition, adding $S_0$, will be passed with an approximation on
the LSB, since this has bias 1/2. Therefore we take j = 0 in (24). We also choose
l = 0 in (24) because then also the last key-addition in the whole approximation
will be passed with bias 1/2. It might be possible to further improve the attack
by (also) using approximations with other values for j and l, but we have not
found a method to do this.

To attack r rounds of RC5 we use the fact that each linear path that is part of
the hull given by (24) is a chain of r 1-round approximations of the form (25). We
will split the approximation into two parts. The first contains the approximation
for the first key addition and the first round. This gives us the following lg w
approximations. Each is the first part of a set of linear approximations that is
contained in the linear hull of (24).

\[ e_0 \cdot L_0 = e_k \cdot L_3 \quad \text{for } k = 0, \ldots, \lg w - 1. \tag{29} \]

The remainder of the whole approximation can be specified by the following lg w
approximations, each beginning with a different bit of $L_3$:

\[ e_k \cdot L_3 = e_0 \cdot L_{2r+1} \quad \text{for } k = 0, \ldots, \lg w - 1. \tag{30} \]

When for a certain plaintext encryption the intermediate value $R_1 \bmod w = k$,
hence an element of $\{0, 1, \ldots, \lg w - 1\}$, then (29) behaves in the deviation
direction. Hence, if also (30) behaves in the deviation direction then the whole
approximation behaves in that direction. On the other hand, if $R_1 \bmod w \notin \{0, 1, \ldots, \lg w - 1\}$
then the probability that the whole approximation behaves
in the direction of the deviation is much lower.

In the attack we want to check for every text if one of the approximations 
that correspond to Ij.Sl was followed. Since we have no information about any 
intermediate values we do not have a criterion that always holds. Instead we 
will derive a function that is expected to give higher values when one of the 
approximations was followed. Hence this function will have higher values for 
encryptions where $R_1 \bmod w \in \{0, 1, \ldots, \lg w - 1\}$ than for other $R_1$ values.
Because $R_1 \bmod w = (R_0 + S_1) \bmod w$ and $R_0$ is known we can guess $S_1 \bmod w$
from this. We call this function the non-uniformity function. In the next section 
we will describe it. 

We note here that the use of such a function fits in the frameworks of
statistical cryptanalysis as described by Vaudenay in [Vau96] or partitioning
cryptanalysis as described by Harpes and Massey in [HM97].
4.2 A Non-uniformity Function

The non-uniformity function $\nu$ computes non-uniformity values for a given set
of corresponding plaintext/ciphertext pairs. This set is divided into w subsets,
each set contains plaintexts with the same $R_0 \bmod w$-value. For each set a
non-uniformity value can be computed.

We look at the last round of the encryption. Suppose that one approximation
of the linear hull corresponding to (30) is followed up to the last round. Say that
in this particular case $e_k \cdot L_{2r-1}$ was biased for some $k \in \{0, \ldots, \lg w - 1\}$.
Then following the approximation to the end would mean that $R_{2r-1} \bmod w =
(w - k) \bmod w$ with a higher probability than other values. As seen in Sect. 3.2
depending on the subkeys also the other possible values of $R_{2r-1} \bmod w$ might
not be uniformly distributed. But in any case certain values of $R_{2r-1} \bmod w$
will have a higher probability when (30) behaves in the deviation direction than
when it does not behave in that way.

If we would know the value of $R_{2r-1} \bmod w$ it would be possible to compute
lg w bits of $S_{2r+1}$ or two possible values for those bits from the ciphertext with:

\[ S_{2r+1} = R_{2r+1} - ((R_{2r-1} \oplus L_{2r+1}) \lll L_{2r+1}) \tag{31} \]

since the values of $R_{2r+1}$ and $L_{2r+1}$ are known from the ciphertext. We will
use S(n) to denote the lg w-bit string, given by the bits (n + lg w - 1) mod
w, ..., (n + 1) mod w, n of S. The value of $L_{2r+1} \bmod w$ determines for which
lg w bits of $S_{2r+1}$ information can be computed, namely $S(L_{2r+1} \bmod w)$. If
$L_{2r+1} \bmod w = 0$ then the lg w LSB's can be computed. When $L_{2r+1} \bmod w \neq 0$
we can compute two values for $S(L_{2r+1} \bmod w)$. The carry bit of the addition
in (31) determines which one is the correct one.

In the attack we do not know the value of $R_{2r-1} \bmod w$ or even which value
would be the most probable. Instead of trying to compute lg w bits of $S_{2r+1}$
we make a similar computation for a value which we call S'. It is computed by
taking $R_{2r-1} \bmod w = 0$ in (31), i.e.,

\[ S' = R_{2r+1} - ((0 \oplus L_{2r+1}) \lll L_{2r+1}) \tag{32} \]

Due to the non-uniform distribution of $R_{2r-1} \bmod w$ it is expected that the
distribution of S'(·)-values will be more non-uniform for encryptions where the
approximation was followed than for others.

Hence, in the attack we use a counter array A(i, j, k) for i, j, k = 0, ..., w - 1,
where each i corresponds to a possible value of $R_0 \bmod w$, each j to a possible
value of $L_{2r+1} \bmod w$ and each k to a possible value of $S'(L_{2r+1} \bmod w)$. For
each text we check if the approximation holds. If it holds, we change the counter
array as follows. If $L_{2r+1} \bmod w = 0$, we increase $A(R_0 \bmod w, L_{2r+1} \bmod w, v)$
by 2, where v is the suggested $S'(L_{2r+1} \bmod w)$-value. If $L_{2r+1} \bmod w \neq 0$, we
increase $A(R_0 \bmod w, L_{2r+1} \bmod w, v_0)$ and $A(R_0 \bmod w, L_{2r+1} \bmod w, v_1)$ by
1, where $v_0$ and $v_1$ are the suggested values. If the approximation does not hold,
we decrease the specific array entries accordingly.

Each ($R_0 \bmod w$, $L_{2r+1} \bmod w$)-combination gives a distribution of S'(·)-values.
From Sect. 4.1 we know that the distributions corresponding to the values
$R_1 \bmod w \in \{0, \ldots, \lg w - 1\}$ will be the most non-uniform. To measure the
non-uniformity we check for all w bits of our S' based on the S'(·)-values how many
times 0 is suggested and how many times 1 and take the difference of these
amounts. For each $R_0 \bmod w$ we take the sum over all possible $L_{2r+1} \bmod w$ of
the absolute values of these differences. In this way the non-uniformity function
$\nu : \{0, \ldots, w - 1\} \to \mathbb{N}$ is defined:

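\[
\nu(i) = \sum_{l_{2r+1}=0}^{w-1} \left| \sum_{x=0}^{\lg w - 1} \left( \sum_{v : e_x \cdot v = 0} A(i, l_{2r+1} + x, v) - \sum_{v : e_x \cdot v = 1} A(i, l_{2r+1} + x, v) \right) \right|,
\tag{33}
\]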

where all indices of A are taken mod w. We call the w values that are derived
in this way the non-uniformity values. We expect that the sum of lg w
non-uniformity values will be maximal for the values corresponding to texts with
$R_1 \bmod w \in \{0, \ldots, \lg w - 1\}$. The final step of the algorithm guesses the value
of $S_1 \bmod w$ accordingly.

The observant reader will have noticed that according to the above description
it is not necessary to have counters for the S'(·)-values. Instead of this one
could use counters corresponding to the bits of S' and change these accordingly.
The description above is used to emphasize that with the S'(·)-distribution also
other non-uniformity measurements could be used. For example, one could look
at all possible values for two subsequent bits of S'. Also it is probably possible to
derive information about the actual value of $S_{2r+1}$ from the S'(·)-distribution.
However, this falls outside the scope of this paper.

4.3 The Algorithm 

1. Acquire n known plaintext/ciphertext-pairs $(P_0, C_0), \ldots, (P_{n-1}, C_{n-1})$.

2. Initialize a counter array^4 A(i, j, k) := 0 for i, j, k = 0, ..., w - 1.

3. For each plaintext/ciphertext-pair do:

   If $L_{2r+1} \bmod w = 0$ then

   a) Compute S'(0)-guess v.

      If $e_0 \cdot L_0 = e_0 \cdot L_{2r+1}$ then $A(R_0, L_{2r+1}, v) := A(R_0, L_{2r+1}, v) + 2$.

      If $e_0 \cdot L_0 = e_0 \cdot L_{2r+1} \oplus 1$ then $A(R_0, L_{2r+1}, v) := A(R_0, L_{2r+1}, v) - 2$.

   If $L_{2r+1} \bmod w \neq 0$ then

   a) Compute $S'(L_{2r+1} \bmod w)$-guesses $v_0$ and $v_1$.

      If $e_0 \cdot L_0 = e_0 \cdot L_{2r+1}$ then $A(R_0, L_{2r+1}, v_0) := A(R_0, L_{2r+1}, v_0) + 1$
      and $A(R_0, L_{2r+1}, v_1) := A(R_0, L_{2r+1}, v_1) + 1$.

      If $e_0 \cdot L_0 \neq e_0 \cdot L_{2r+1}$ then $A(R_0, L_{2r+1}, v_0) := A(R_0, L_{2r+1}, v_0) - 1$
      and $A(R_0, L_{2r+1}, v_1) := A(R_0, L_{2r+1}, v_1) - 1$.

4. Compute w non-uniformity values $\nu(i)$ according to (33), where i corresponds
with a value of $R_0 \bmod w$.

5. Find the value $x \in \{0, \ldots, w - 1\}$ for which $\sum_{i=0}^{\lg w - 1} \nu((x + i) \bmod w)$ is
maximal.

6. Guess $S_1 \bmod w = w - x$.

^4 For clarity in the description of the algorithm we have left out mod w
when referring to indices of A.
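The steps above can be sketched in code. The sketch below is one possible reading of the description, not the authors' implementation: it keeps one signed counter per bit of S' (the simplification mentioned at the end of Sect. 4.2) instead of the array A(i, j, k), it takes the non-uniformity value of a subset to be the sum over the w bit positions of the absolute counter values, and it handles the unknown carry by trying both borrow values. All function names are hypothetical.

```python
def rotl(x, s, w):
    s %= w
    return ((x << s) | (x >> (w - s))) & ((1 << w) - 1)

def rotr(x, s, w):
    return rotl(x, (w - s) % w, w)

def s_prime_guesses(L_end, R_end, w):
    """Candidate lg w-bit strings S'(n), n = L_{2r+1} mod w, taking R_{2r-1} mod w = 0."""
    n = L_end % w
    T = rotl(L_end, n, w)                 # (0 xor L_{2r+1}) <<< L_{2r+1}, cf. (32)
    low = (R_end - T) % (1 << n)          # bits below n do not depend on the borrow
    guesses = []
    for borrow in ((0,) if n == 0 else (0, 1)):
        high = ((R_end >> n) - (T >> n) - borrow) % (1 << (w - n))
        word = (high << n) | low
        guesses.append(rotr(word, n, w) % w)   # bits n, ..., n + lg w - 1 (cyclic)
    return n, guesses

def update_counters(D, L0, R0, L_end, R_end, w):
    """Step 3 with per-bit counters: D[R_0 mod w][bit position of S']."""
    lgw = w.bit_length() - 1
    sign = 1 if ((L0 ^ L_end) & 1) == 0 else -1   # does e_0.L_0 = e_0.L_{2r+1} hold?
    n, guesses = s_prime_guesses(L_end, R_end, w)
    weight = 2 // len(guesses)                    # 2 for one guess, 1 for each of two
    for v in guesses:
        for x in range(lgw):
            vote = 1 if ((v >> x) & 1) == 0 else -1
            D[R0 % w][(n + x) % w] += sign * weight * vote

def nonuniformity(D):
    """Step 4: one non-uniformity value per value of R_0 mod w."""
    return [sum(abs(d) for d in row) for row in D]

def guess_s1_mod_w(nu):
    """Steps 5 and 6: best window of lg w consecutive values, then S_1 mod w = w - x."""
    w = len(nu)
    lgw = w.bit_length() - 1
    x = max(range(w), key=lambda x: sum(nu[(x + i) % w] for i in range(lgw)))
    return (w - x) % w
```

A driver would initialize D as a w by w array of zeros, call update_counters once per plaintext/ciphertext pair, and pass nonuniformity(D) to guess_s1_mod_w. Whether this per-bit variant reaches the success rates reported in Sect. 4.4 has not been checked here.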
4.4 The Results

We have implemented the attack on RC5-32 and RC5-64. The results are given in
Table 3 and Table 4. As can be seen in the tables, we have done tests for up to 5
rounds of RC5-32 and up to 4 rounds of RC5-64. For each number of rounds tests 
were performed for several amounts of plaintexts. To give an indication of the 
practical aspects of the experiments: with our RC5-implementation carrying out 
the attack on the 5-round version for 10 keys with 2^^ plaintexts took about 21 
hours on a 333 MHz Pentium. To our knowledge these are the first experimentally 
executed known plaintext attacks on reduced versions of RC5, which require a 
negligible storage. 

As stated at the end of Sect. 3.3 we would expect that an attack on r rounds
of RC5 with x known plaintexts should have the same success probability as an
attack on r + 1 rounds with $(c^+)^{-2} x$ texts. For RC5-32 it holds that $(c^+)^{-2} \approx 2^{6.8}$,
for RC5-64 $(c^+)^{-2} \approx 2^{8.4}$. It can be seen from the tables that the results are
better than expected, i.e., the factor to attack an extra round is ~ 2® for RC5- 
32 and ~ 2® for RC5-64. We conjecture that the reason for this is that the 
value of Ri +2 mod w depends significantly on the value of Ri mod w. We are 
still researching this problem, but we give some preliminary evidence in the next 
section, where we discuss the consequences for RC6. 

Based on the experimental results and the theoretical estimation of the bias
of the linear approximation, we can estimate the complexity of the attack on
more than 4 or 5 rounds (cf. Table 5). It follows that an attack can be mounted
on RC5-32 with 10 rounds that has a success probability of 45% if $2^{62}$ plaintexts
are available. An attack on RC5-64 with 15 rounds has a success probability of
90% if $2^{123}$ plaintexts are available.
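The extrapolation can be sketched in a few lines of Python. This is only an illustration: it assumes the exponents of Table 5 read 2+6r and 4+6r for RC5-32 and 1+8r and 3+8r for RC5-64 (success probability 45% and 90%, respectively), and the names `TABLE5`, `log2_plaintexts` and `max_attackable_rounds` are hypothetical helpers, not part of the original experiments.

```python
# Exponent model assumed from Table 5: plaintexts = 2**(offset + slope * r), for r > 2.
TABLE5 = {
    ("RC5-32", 45): (2, 6),
    ("RC5-32", 90): (4, 6),
    ("RC5-64", 45): (1, 8),
    ("RC5-64", 90): (3, 8),
}

# Block size in bits (two words), i.e. log2 of the size of the full codebook.
BLOCK_BITS = {"RC5-32": 64, "RC5-64": 128}


def log2_plaintexts(cipher, success_percent, rounds):
    """Return log2 of the estimated number of known plaintexts."""
    offset, slope = TABLE5[(cipher, success_percent)]
    return offset + slope * rounds


def max_attackable_rounds(cipher, success_percent):
    """Largest r whose plaintext requirement still fits in the codebook."""
    r = 3
    while log2_plaintexts(cipher, success_percent, r + 1) <= BLOCK_BITS[cipher]:
        r += 1
    return r


if __name__ == "__main__":
    for (cipher, pct) in sorted(TABLE5):
        r = max_attackable_rounds(cipher, pct)
        print(cipher, f"{pct}%", "rounds:", r,
              "log2(plaintexts):", log2_plaintexts(cipher, pct, r))
```

Under these assumed exponents, the largest round counts whose data requirement stays below the codebook size are 10 rounds for RC5-32 and 15 rounds for RC5-64, which is in line with the figures quoted in the text.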

5 Consequences for RC6 

The block cipher RC6 has been submitted to NIST as an AES-candidate. 

Its design was based on RC5 and the security evaluation of RC5. To meet the 
block size requirement of 128 bits and to keep a 32-bit processor oriented design 
for this block size, RC6 was designed as two RC5-32’s (with some changes) in 
parallel that interact.* Hence, cryptanalysis of RC5 can mostly be adapted for
analysis of RC6. 

However, the RC5-structure used in RC6 differs from the original version. 
One of the most important differences is the following. The amount of rotation 
in RC5-32 was determined by taking the 5 LSB’s of a 32-bit word. In RC6, the 
5 bits that determine the amount of rotation depend on all 32 bits of a 32-bit 
word. 

In the first place these changes to RC5 were made to preclude the successful 
differential attacks on RC5 lfik9slR(;b.1l . These attacks make use of the fact 

* Actually, the design of RC6 is also word oriented and the block size is 4w, where w
is the word size. We only discuss the version with a 128-bit block size, as it is the main object
of the AES standardization process.
Table 3. Experimental results of the attack on RC5-32.

Rounds   Known plaintexts   Success rate
2        2^13               28/100
         2^14               46/100
         2^15               89/100
         2^16               92/100
3        2^17               7/100
         2^18               15/100
         2^19               28/100
         2^20               49/100
         2^21               69/100
         2^22               81/100
4        2^24               7/100
         2^25               26/100
         2^26               44/100
         2^27               77/100
         2^28               82/100
5        2^32               4/10
         2^33               7/10
         2^34               9/10



Table 4. Experimental results of the attack on RC5-64.

Rounds   Known plaintexts   Success rate
2        2^17               39/100
         2^18               82/100
         2^19               96/100
3        2^25               28/50
         2^26               40/50
         2^27               47/50
4        2^33               9/10



Table 5. Expected number of plaintexts needed for a known plaintext attack on r (> 2)
rounds of RC5-32 or RC5-64.

Success probability   45%        90%
RC5-32                2^(2+6r)   2^(4+6r)
RC5-64                2^(1+8r)   2^(3+8r)
that only five LSB’s determine the rotations. However, these changes also provide
increased resistance to the attack method described in this paper. 

To illustrate this we look at a transition version between RC5 and RC6. In 
the definition of RC5, replace m and ® with 



    $T_i = (R_i (2R_i + 1)) \lll 5$        (34)

    $U_i = L_i \oplus T_i$        (35)

    $V_i = U_i \lll T_i$        (36)

For our theoretical analysis concerning the linear hull effect, this change makes 
little difference. One can still use the same linear approximations. However, the 
first round trick and last round trick we have used now become more complicated. 
To compute $T_i \bmod 32$, one has to guess all bits of $R_i$, and the construction of a
non-uniformity function is not obvious. 
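The difference between the two ways of deriving the rotation amount can be illustrated with a small Python sketch. It assumes 32-bit words and reads equation (34) as a multiplication modulo 2^32 followed by a left rotation by 5; the function names are hypothetical and chosen only for this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, s):
    """Rotate the 32-bit word x left by s positions."""
    s %= 32
    return ((x << s) | (x >> (32 - s))) & MASK32


def rc5_rotation_amount(r):
    """RC5-32: the rotation amount is given by the 5 least significant bits of R_i."""
    return r & 31


def transition_rotation_amount(r):
    """Transition version: 5 LSBs of T_i = (R_i * (2*R_i + 1) mod 2**32) <<< 5."""
    t = rotl32((r * (2 * r + 1)) & MASK32, 5)
    return t & 31


if __name__ == "__main__":
    a, b = 0x00000001, 0x80000001  # differ only in the most significant bit
    print(rc5_rotation_amount(a), rc5_rotation_amount(b))
    print(transition_rotation_amount(a), transition_rotation_amount(b))
```

Because the rotation by 5 moves the five most significant bits of the product into the low positions, the rotation amount in the transition version equals the top five bits of R_i(2R_i + 1) mod 2^32 and therefore depends on every bit of R_i. In the example, flipping the top bit of R_i leaves the RC5 rotation amount unchanged but changes the one of the transition version.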

We have done tests on the above described intermediate version with the 
last and first round replaced with a normal RC5-round. Then the first and last 
round trick are straightforward. We have implemented the attack for 3 and 4 
rounds of the cipher and results indicate that the increase in the number of 
necessary plaintexts for the same success probability was more in accordance 
with the theoretical results for RC5. Hence, the application of the extra function 
to determine the rotation amounts causes these values to be more independent. 

We conclude that our attack does not give an obvious possibility to mount a 
realistic attack on RC6. Currently we are working on a precise evaluation of its 
resistance against this attack method. 

6 Conclusions 

In this paper we have evaluated the resistance of RC5 against linear attacks. We 
have taken into account the applicability of the Piling Up Lemma and the conse- 
quences of linear hull effects, both in combination with possible key dependency. 
This resulted in estimates for the complexity to mount a linear attack. 

Furthermore we have described an attack that exploits the linear hull effect 
that we described and implemented it on reduced versions of RC5-32 and RC5- 
64. In this way we estimate that our attack can theoretically break RC5-32 with 
10 rounds and RC5-64 with 15 rounds. In comparison with previous attacks on 
RC5, our attack needs negligible storage capacity, i.e., it could be practically 
implemented. 

The attack method has no serious consequences for the security of RC6. 
Apparently the precautions that the designers made to make RC6 more resi- 
stant against differential attacks, also made RC6 more resistant against more 
sophisticated linear attack methods. 
Abstract. The block cipher CRYPTON has been proposed as a candidate
algorithm for the Advanced Encryption Standard (AES). To fix
some minor weaknesses in the key schedule and to remove some undesirable
properties in the S-boxes, we made some changes to the AES proposal,
i.e., in the S-box construction and key scheduling. This paper presents
the revised version of CRYPTON and its preliminary analysis.



1 Motivations and Changes 

The block cipher CRYPTON has been proposed as a candidate algorithm for the 
AES P2| • Unfortunately, however, we couldn’t have enough time to refine our al- 
gorithm at the time of submission. So, we later revised part of the AES proposal. 
This paper describes this revision and analyzes its security and efficiency. 

CRYPTON v1.0 differs from the AES proposal (v0.5) only in the S-box
construction and key scheduling. As we mentioned at the 1st AES candidate
conference, we already had a plan to revise the CRYPTON key schedule. The
previous key schedule was in fact expected from the beginning to have some
minor weaknesses due to its overly simple round key computations (actually a
slight weakness was found by Serge Vaudenay etc. at ENS (posted at NIST’s 
AES forum) and by Johan Borst 0). We thus made some enhancements to 
the original key schedule, while trying to keep changes minimal. The new key 
schedule now makes use of bit and word rotations, as well as byte rotations. 
We also used distinct round constants for each round key. In this way we tried to
ensure that each byte of the expanded keys is used in different locations (and different bit
positions within a byte) of the 4 x 4 byte array in different rounds. The new key
schedule still runs much faster than one-block encryption. 

In v0.5 we used two 8x8 S-boxes constructed from 4-bit permutations using
a 3-round Feistel cipher. Such S-boxes, however, turned out to have too many 
low- weight, high-probability characteristics that may cause weak diffusion by 
the linear transformation following the S-box transformation. For example, the 
S-boxes used in VO.5 have about 300 characteristics with probability 2 ~^ and 
160 linear approximations with probability 2“^. Furthermore, some of such I/O 
pairs turned out to have minimal diffusion under linear transformations. Though 
we could achieve reasonably high security bounds even with such S-boxes, we 
wanted to make CRYPTON stronger for long-term security by allowing a
large safety margin. We thus decided to strengthen the S-box on this occasion.



L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 31-^^ 1999. 
© Springer- Verlag Berlin Heidelberg 1999 






C.H. Lim 



Experiments show that most Feistel-type constructions seem to generate S- 
boxes with too many high-probability characteristics, so we decided to use other 
construction methods. We first started with a Feistel structure involving three or 
four 4-bit permutations and repeated modifications and testings of the structure 
to get better 8-bit S-boxes. In the end we arrived at the structure of an SP
network described in Sect. 3.2. The resulting S-boxes are much stronger against
differential and linear cryptanalysis when combined with the linear transformations
used. We decided to use four variants of one S-box, instead of four independent
S-boxes, to allow greater flexibility in memory requirements (e.g., for cost-effective
implementations on smart cards).

Finally, we would like to stress that the above modifications do not require 
any substantial change in existing analysis on the security and efficiency. The 
security evaluation of the new version can be done simply by replacing the old figures
related to S-box characteristics with new ones, and there is no change in
the overall structure of key scheduling. The performance figures in software
implementations remain almost the same. The hardware complexity is slightly
increased due to the increased complexity of the logic implementation of the S-boxes.

Throughout this paper we will use the following symbols and notation: 

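— A 4 x 4 byte array A is represented by
  $A = (A[3], A[2], A[1], A[0])^t$, where each row $A[i] = (a_{i3}, a_{i2}, a_{i1}, a_{i0})$ is a 32-bit word.

— $X^{\lll n}$: left rotation of X by n-bit positions.

— $X^{\lll_b n}$: left rotation of each byte in a 32-bit number X by n-bit positions.

— $f \circ g$: composition of functions f and g, i.e., $(f \circ g)(x) = f(g(x))$.

— $\wedge, \oplus$: bit-wise logical operations for AND and XOR, respectively.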

2 Algorithm Specifications 

CRYPTON processes a data block of 16 bytes by representing it into a 4 x 4 byte 
array as in SQUARE pj. The round transformation of CRYPTON consists of 
four parallelizable steps: byte-wise substitutions, column-wise bit permutation, 
column-to-row transposition, and then key addition. The encryption process in- 
volves 12 repetitions of (essentially) the same round transformation. The decryp- 
tion process can be made identical to the encryption process with a different key 
schedule. This section presents a detailed description of CRYPTON v1.0.

2.1 Basic Building Blocks 

Nonlinear Substitution γ. The nonlinear transformation γ consists of byte-wise
substitutions on a 4 x 4 byte array by using four 8 x 8 S-boxes, $S_i$ ($0 \le i \le 3$),
such that $S_2 = S_0^{-1}$ and $S_3 = S_1^{-1}$ (see Sect. 3.2 for details). Two different S-box
arrangements are used in successive rounds alternately: $\gamma_o$ in odd rounds and $\gamma_e$
in even rounds. They are defined as

    $B = \gamma_o(A) \iff b_{ij} = S_{(i+j) \bmod 4}(a_{ij})$,

    $B = \gamma_e(A) \iff b_{ij} = S_{(i+j+2) \bmod 4}(a_{ij})$.

Observe that the four S-boxes are arranged so that $\gamma_o^{-1} = \gamma_e$ and $\gamma_e^{-1} = \gamma_o$. This
property will be used to derive identical processes for encryption and decryption.

Bit Permutation π. The bit permutation π bit-wise mixes each byte column
of the 4 x 4 byte array using four masking bytes $m_i$ given by

    $m_0$ = 0xfc, $m_1$ = 0xf3, $m_2$ = 0xcf, $m_3$ = 0x3f.

We first define four column permutations $\pi_i$ ($0 \le i \le 3$) as

    $[b_3, b_2, b_1, b_0]^t = \pi_i([a_3, a_2, a_1, a_0]^t) \iff b_j = \bigoplus_{k=0}^{3} (m_{(i+j+k) \bmod 4} \wedge a_k)$.

The $b_j$ can be expressed alternatively using bit extraction and xoring as

    $b_j = \bigoplus_{k=0}^{3} (\bar{m}_{(i+j+k) \bmod 4} \wedge a_k) \oplus a$,

where $\bar{m}_k$ denotes the bit-wise complement of $m_k$ and $a = \bigoplus_{k=0}^{3} a_k$.

As in γ, we use two slightly different versions of the bit permutation to make
the encryption and decryption processes identical: $\pi_o$ in odd rounds and $\pi_e$ in even
rounds. Let $A^i$ be the i-th byte column of a 4 x 4 byte array A, i.e., $A^i =
(a_{3i}, a_{2i}, a_{1i}, a_{0i})^t$. Then the bit permutations $\pi_o$ and $\pi_e$ are defined as

    $\pi_o(A) = (\pi_3(A^3), \pi_2(A^2), \pi_1(A^1), \pi_0(A^0))$,

    $\pi_e(A) = (\pi_1(A^3), \pi_0(A^2), \pi_3(A^1), \pi_2(A^0))$.

Note that $\pi_o^{-1} = \pi_o$ and $\pi_e^{-1} = \pi_e$ and that if $\pi_0([d, c, b, a]^t) = [h, g, f, e]^t$, then

    $\pi_1([d, c, b, a]^t) = [e, h, g, f]^t$, $\pi_2([d, c, b, a]^t) = [f, e, h, g]^t$, $\pi_3([d, c, b, a]^t) = [g, f, e, h]^t$.

This property will be used to derive an efficient decryption key schedule from
the encryption key schedule (see Sect. 2.3).

Byte Transposition τ. It simply moves the byte at the (i, j)-th position to the
(j, i)-th position, i.e., $B = \tau(A) \iff b_{ij} = a_{ji}$. Note that $\tau^{-1} = \tau$.

Key Xoring σ. For a round key $K = (K[3], K[2], K[1], K[0])^t$, $B = \sigma_K(A)$ is
defined by $B[i] = A[i] \oplus K[i]$ for $0 \le i \le 3$. Obviously, $\sigma_K^{-1} = \sigma_K$.

Round Transformation ρ. One round of CRYPTON consists of applying γ, π, τ
and σ in sequence to the 4 x 4 data array. More specifically, the odd and even
round functions are defined (for round key K) by

    $\rho_{oK} = \sigma_K \circ \tau \circ \pi_o \circ \gamma_o$ for odd rounds,

    $\rho_{eK} = \sigma_K \circ \tau \circ \pi_e \circ \gamma_e$ for even rounds.
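The linear building blocks of the round function can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: it assumes that the column permutation is b_j = XOR over k of (m_((i+j+k) mod 4) AND a_k), that the state is stored as `state[i][j]` = a_ij, and that round keys are given as 4 x 4 byte arrays. The S-box layer γ is omitted because the S-box tables are not listed in this section. All function names are hypothetical.

```python
MASKS = (0xFC, 0xF3, 0xCF, 0x3F)  # masking bytes m_0 .. m_3 from the text


def pi_column(i, col):
    """Apply pi_i to one byte column col = [a_0, a_1, a_2, a_3]; returns [b_0 .. b_3]."""
    out = []
    for j in range(4):
        b = 0
        for k in range(4):
            b ^= MASKS[(i + j + k) % 4] & col[k]
        out.append(b)
    return out


PI_ODD = (0, 1, 2, 3)   # pi_o: byte column c is mapped by pi_c
PI_EVEN = (2, 3, 0, 1)  # pi_e: byte column c is mapped by pi_((c+2) mod 4)


def pi_layer(state, order):
    """Apply pi_(order[c]) to every byte column c of the 4 x 4 state."""
    new = [[0] * 4 for _ in range(4)]
    for c in range(4):
        column = [state[r][c] for r in range(4)]
        mapped = pi_column(order[c], column)
        for r in range(4):
            new[r][c] = mapped[r]
    return new


def tau(state):
    """Byte transposition: b_ij = a_ji."""
    return [[state[j][i] for j in range(4)] for i in range(4)]


def sigma(state, key):
    """Key xoring: every state byte is XORed with the corresponding key byte."""
    return [[state[i][j] ^ key[i][j] for j in range(4)] for i in range(4)]


if __name__ == "__main__":
    s = [[(17 * (4 * i + j) + 3) & 0xFF for j in range(4)] for i in range(4)]
    print(pi_layer(pi_layer(s, PI_ODD), PI_ODD) == s)  # pi_o is an involution
    print(pi_column(0, [0x01, 0, 0, 0]))               # one active byte spreads to three
```

The sketch makes two properties stated in the text easy to check: each of π_o, π_e and τ is its own inverse, and the output of π_1 is the output of π_0 with the bytes cyclically shifted by one position.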
2.2 Encryption and Decryption

Let $K_e^i$ be the i-th encryption round key consisting of 4 words, derived from a
user-supplied key K using the encryption key schedule. The encryption transformation
$E_K$ of 12-round CRYPTON under key K consists of an initial key
addition, 6 repetitions of $\rho_o$ and $\rho_e$, and then a final output transformation.
More specifically, $E_K$ can be described as

    $E_K = \phi_e \circ \rho_{eK_e^{12}} \circ \rho_{oK_e^{11}} \circ \cdots \circ \rho_{eK_e^{2}} \circ \rho_{oK_e^{1}} \circ \sigma_{K_e^{0}}$,        (1)

where $\phi_e$ is an output transformation to make the encryption and decryption processes
identical and is given by $\tau \circ \pi_e \circ \tau$. Similarly we define $\phi_o = \tau \circ \pi_o \circ \tau$.

The corresponding decryption transformation $D_K$ can be shown to have the
same form as $E_K$, except for using suitably transformed round keys:

    $D_K = \phi_e \circ \rho_{eK_d^{12}} \circ \rho_{oK_d^{11}} \circ \cdots \circ \rho_{eK_d^{2}} \circ \rho_{oK_d^{1}} \circ \sigma_{K_d^{0}}$,        (2)

where the decryption round keys are defined (with r = 12 the number of rounds) by

    $K_d^{r-i} = \phi_e(K_e^i)$ for i = 0, 2, 4, ...,
    $K_d^{r-i} = \phi_o(K_e^i)$ for i = 1, 3, 5, ....

This shows that decryption can be performed by the same function as encryption
with a different key schedule.

Notice that

    $\phi_e \circ \sigma_{K_e^i} \circ \phi_e = \sigma_{K_d^{r-i}}$ for i = 0, 2, 4, ...,

    $\phi_o \circ \sigma_{K_e^i} \circ \phi_o = \sigma_{K_d^{r-i}}$ for i = 1, 3, 5, ....

Using this property, we can incorporate the output transformation into the
final round as $\phi_e \circ \rho_{eK_e^{12}} = \sigma_{K_d^{0}} \circ \tau \circ \gamma_e$.
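The merging of the output transformation into the final round can be checked step by step. The derivation below is a sketch that uses only properties stated in Sect. 2.1 and 2.2: τ and $\pi_e$ are involutions, $\phi_e = \tau \circ \pi_e \circ \tau$ is linear with respect to XOR, and $\rho_{eK} = \sigma_K \circ \tau \circ \pi_e \circ \gamma_e$.

```latex
\begin{align*}
\phi_e \circ \rho_{eK_e^{12}}
  &= \phi_e \circ \sigma_{K_e^{12}} \circ \tau \circ \pi_e \circ \gamma_e \\
  &= \sigma_{\phi_e(K_e^{12})} \circ \phi_e \circ \tau \circ \pi_e \circ \gamma_e
     && \text{($\phi_e$ is linear over XOR)} \\
  &= \sigma_{\phi_e(K_e^{12})} \circ \tau \circ \pi_e \circ \tau \circ \tau \circ \pi_e \circ \gamma_e
     && \text{(definition of $\phi_e$)} \\
  &= \sigma_{\phi_e(K_e^{12})} \circ \tau \circ \gamma_e
     && \text{($\tau \circ \tau = \pi_e \circ \pi_e = \mathrm{id}$)}
\end{align*}
```

Hence the last round needs no bit permutation at all: it consists of the substitution $\gamma_e$, the transposition τ, and a key addition with the transformed key $\phi_e(K_e^{12})$.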

2.3 Key Scheduling 

CRYPTON requires a total of 4 x 13 = 52 round keys, each of which is 32 bits long.
These round keys are generated from a user key of 8k (k = 0, 1, ..., 32) bits in
two steps: first, the user key is nonlinearly transformed into 8 expanded keys; then
the required number of round keys is generated from these expanded keys using
simple operations. This two-step generation of round keys allows efficient
on-the-fly round key computation in cases where storage constraints do not
permit storing all the round keys (e.g., implementation in a portable device
with restricted resources). It also facilitates hardware implementations.

Generating Expanded Keys. Let $K = k_{31} \cdots k_1 k_0$ be a 256-bit user key, where each $k_j$ is a byte.
We first split K into U and V such that $U[i] = k_{8i+6} k_{8i+4} k_{8i+2} k_{8i}$ and $V[i] =
k_{8i+7} k_{8i+5} k_{8i+3} k_{8i+1}$ for i = 0, 1, 2, 3. Then we compute the 8 expanded keys
$E_e[i]$ ($0 \le i \le 7$) using round transformations with all-zero key as

    $U' = \rho_o(U)$, $V' = \rho_e(V)$,

    $E_e[i] = U'[i] \oplus T_1$, $E_e[i + 4] = V'[i] \oplus T_0$,

where $T_0 = \bigoplus_{i=0}^{3} U'[i]$ and $T_1 = \bigoplus_{i=0}^{3} V'[i]$.
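The byte bookkeeping of this step can be sketched in Python. The sketch covers only the splitting of the user key and the final XOR with T_0 and T_1; the round transformations ρ_o and ρ_e are not applied here because they need the S-box tables. It assumes that `key[j]` holds the byte k_j, that the leftmost byte of a word is the most significant one, and that T_0 is the XOR of the words of U' and T_1 that of V'. The function names are hypothetical.

```python
def split_user_key(key):
    """Split a 32-byte user key (key[j] = k_j) into word lists U and V."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit (32-byte) user key")

    def word(b3, b2, b1, b0):
        # Assumption: the leftmost byte is the most significant byte of the word.
        return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0

    u = [word(key[8 * i + 6], key[8 * i + 4], key[8 * i + 2], key[8 * i])
         for i in range(4)]
    v = [word(key[8 * i + 7], key[8 * i + 5], key[8 * i + 3], key[8 * i + 1])
         for i in range(4)]
    return u, v


def combine_expanded_keys(u_prime, v_prime):
    """Return E_e[0..7] given U' and V' (the outputs of rho_o and rho_e)."""
    t0 = 0
    for w in u_prime:
        t0 ^= w
    t1 = 0
    for w in v_prime:
        t1 ^= w
    return [w ^ t1 for w in u_prime] + [w ^ t0 for w in v_prime]


if __name__ == "__main__":
    u, v = split_user_key(bytes(range(32)))
    print([hex(w) for w in u])
    print([hex(w) for w in v])
```

With the test key 0x00, 0x01, ..., 0x1f, the even-indexed bytes end up in U and the odd-indexed bytes in V, so both halves depend on the whole key.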
Generating encryption round keys. The following 13 round constants will
be used for the encryption key schedule:

    $C_e[0]$ = 0xa54ff53a, $C_e[i] = C_e[i-1]$ + 0x3c6ef372 mod $2^{32}$ for i = 1, 2, ..., 12.
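The recurrence for the round constants is plain 32-bit modular addition and can be sketched as follows (the function name `encryption_round_constants` is a hypothetical helper for this illustration):

```python
def encryption_round_constants(count=13):
    """Return C_e[0..count-1] with C_e[i] = C_e[i-1] + 0x3c6ef372 mod 2**32."""
    constants = [0xA54FF53A]
    for _ in range(1, count):
        constants.append((constants[-1] + 0x3C6EF372) & 0xFFFFFFFF)
    return constants


if __name__ == "__main__":
    for i, c in enumerate(encryption_round_constants()):
        print(f"C_e[{i:2d}] = 0x{c:08x}")
```

Since only one addition per round is needed, the constants can be generated on the fly instead of being stored.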

In addition, we also use the following 4 masking constants to generate distinct 
constants for each round key from a given round constant: 

MCo = Oxacacacac, MCi = for i = 1, 2, 3. 

1. compute the round keys for the first 2 rounds as 

    $K_e[i] \leftarrow E_e[i] \oplus C_e[0] \oplus MC_i$,

    $K_e[i + 4] \leftarrow E_e[i + 4] \oplus C_e[1] \oplus MC_i$ for $0 \le i \le 3$.

2. for rounds r = 2, 3, • • • , 12, repeat the following two steps alternately: 

2-1. even rounds: 

{Se[3],Se[2],l?e[l],15e[0]}^ {l?e[0]«'>®,i5e[3]«'>®,Se[2]«l®,Se[l]«'"}, 
$K_e[4r + i] \leftarrow E_e[i] \oplus C_e[r] \oplus MC_i$ for $0 \le i \le 3$.

2-2. odd rounds: 

$K_e[4r + i] \leftarrow E_e[i + 4] \oplus C_e[r] \oplus MC_i$ for $0 \le i \le 3$.

Generating decryption round keys. For an efficient decryption key schedule,
first observe that $\phi_o = \tau \circ \pi_o \circ \tau$ and $\phi_e = \tau \circ \pi_e \circ \tau$ can be rewritten as

    $\phi_o(A) = (\phi_3(A[3]), \phi_2(A[2]), \phi_1(A[1]), \phi_0(A[0]))^t$,

    $\phi_e(A) = (\phi_1(A[3]), \phi_0(A[2]), \phi_3(A[1]), \phi_2(A[0]))^t$.

Here $\phi_i$ is actually the same as $\pi_i$ except that the 4 input bytes are now arranged
in a row vector (see Sect. 2.1.2). Also note the shift and linear properties of $\phi_i$:

for fc = 1, 2, 3, 

4>i{X) = for j = i 0 1 mod 4, 

for k = 1,2,3, 

    $\phi_i(A[j] \oplus C) = \phi_i(A[j]) \oplus \phi_i(C)$.

In particular, $\phi_i(C) = C$ if C consists of 4 identical bytes. Using these properties,
we can design a decryption key schedule similar to and almost as efficient as the
encryption key schedule as follows (the decryption round constants $C_d[i]$ are given
by $C_d[i] = \phi_2(C_e[12 - i])$ for even i and $C_d[i] = \phi_0(C_e[12 - i])$ for odd i):

1. compute the expanded keys and round constants for decryption as follows: 

{£:d[3],£:d[2],£:d[l],£;d[0]} ^ {,^o(Be[i;)<<'’",c^i(£^e[Ol),^i(£.[3])«'-",</>2(£;e[2])«'’"}, 
{Ad[7],Ad[6],£;d[5],£;d[4]} ^ {b2(Ae[6])«^^ bo(Ae[5])«^^ 
2. compute the first 8 round keys as

K4i\ ^ Ea\i] © © MCi, 

Ka[i + 4] ^ Ea[i + 4] © C7d[l]«®"-®* © MCi for 0 < i < 3. 

3. for rounds r = 2, 3, • • • , 12, repeat the following two steps alternately: 

3-1. even rounds: 

{£43], £42], £41], £40]} ^ {£42]«'>4 £41]««, £40]«^®, £43]«‘'|. 

Kd[4r + i] ^ £4i] © C4r]«®^-®' © MCi for 0 < i < 3. 

3-2. odd rounds: 

(£d [7] , £d [6] , £d [5] , £d [4] } ^ (£44] «'>® , £^ [7] , Ea [6] , £45] «'>® } , 

Kd[M + i] ^ £4i + 4] © £4r]«®2-®' © MCi for 0 < i < 3. 

3 Security Analysis 

3.1 Diffusion Property of Linear Transformations 

Due to memory requirements, small S-boxes are commonly used in most
block cipher designs, and thus effective diffusion of S-box outputs by linear transformations
plays an important role in resistance against various attacks such as
differential and linear cryptanalysis (DC and LC for short).

From Sect. 2.1.2, we can see that it suffices to consider any one component
transformation $\pi_i$ of π to examine the diffusion property of π, since π acts on
each byte column independently. It is also easy to see that any column vector
with n (n < 4) nonzero bytes is transformed by $\pi_i$ into a column vector with at
least 4 − n nonzero bytes, so that the total number of nonzero input and output
bytes is at least 4 (we call this number 4 the diffusion order of $\pi_i$). This is
due to the operation of exclusive-or sum in π. More important is that the number
of such input vectors giving minimal diffusion is very limited. This is due to the
masked bit permutation. Table 1 shows the distribution of diffusion orders by $\pi_i$
over all 32-bit numbers. We can see that there are only 204 values achieving the
minimum diffusion order 4 and about 99.96 % of 32-bit numbers have diffusion
order 7 or 8. This shows the effectiveness of diffusion by our combined linear
transformation $\tau \circ \pi$ in successive rounds.



diffusion order   4              5              6              7              8
no. elements      204            13464          1793364        130589784      4162570479
ratio             4.75 x 10^-8   3.13 x 10^-6   4.18 x 10^-4   3.04 x 10^-2   96.92 x 10^-2

Table 1. Distribution of diffusion orders under $\pi_i$



Let us examine in more detail the set of 32-bit numbers giving minimal
diffusion. For this, we define two sets of byte values, $\Omega_x$ and $\Omega_y$, as

    $\Omega_x$ = {0x01, 0x02, 0x03, 0x04, 0x08, 0x0c, 0x10, 0x20, 0x30, 0x40, 0x80, 0xc0},
    $\Omega_y$ = {0x11, 0x12, 0x13, 0x21, 0x22, 0x23, 0x31, 0x32, 0x33, 0x44, 0x48, 0x4c,
                 0x84, 0x88, 0x8c, 0xc4, 0xc8, 0xcc} $\cup$ $\Omega_x$.

Let $I_j$ be a set of input vectors with j nonzero bytes which are transformed by
$\pi_i$ into output vectors with 4 − j nonzero bytes. Then all possible 32-bit values
with minimum diffusion can be obtained as: for each x in $\Omega_x$ and y in $\Omega_y$,

    $I_1 = \{(0, 0, 0, x)^t, (0, 0, x, 0)^t, (0, x, 0, 0)^t, (x, 0, 0, 0)^t\}$,

    $I_2 = \{(0, 0, x, x)^t, (0, x, x, 0)^t, (x, x, 0, 0)^t, (x, 0, 0, x)^t, (0, y, 0, y)^t, (y, 0, y, 0)^t\}$,

    $I_3 = \{(0, x, x, x)^t, (x, 0, x, x)^t, (x, x, 0, x)^t, (x, x, x, 0)^t\}$.

Therefore, we can see that there are only 204 vectors with minimum diffusion:
48 from $\pi_i(I_1) = I_3$, 108 from $\pi_i(I_2) = I_2$ and 48 from $\pi_i(I_3) = I_1$. Observe that
the nonzero bytes in each input vector should have the same value to achieve
minimum diffusion. Also note that the 18 values in $\Omega_y - \Omega_x$ can only occur for
inputs with two separated nonzero bytes (the last two cases in $I_2$).
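The count of 204 minimal-diffusion vectors can be reproduced with a short Python sketch. It assumes the column permutation b_j = XOR over k of (m_((i+j+k) mod 4) AND a_k) with the masks of Sect. 2.1, and it represents a column as a tuple (a_0, a_1, a_2, a_3); the helper names are hypothetical. The sketch only verifies that the listed vectors reach diffusion order 4; it does not search all 2^32 inputs.

```python
MASKS = (0xFC, 0xF3, 0xCF, 0x3F)

OMEGA_X = [0x01, 0x02, 0x03, 0x04, 0x08, 0x0C,
           0x10, 0x20, 0x30, 0x40, 0x80, 0xC0]
OMEGA_Y = [0x11, 0x12, 0x13, 0x21, 0x22, 0x23, 0x31, 0x32, 0x33,
           0x44, 0x48, 0x4C, 0x84, 0x88, 0x8C, 0xC4, 0xC8, 0xCC] + OMEGA_X


def pi_column(i, col):
    """Apply pi_i to the byte column col = (a_0, a_1, a_2, a_3)."""
    return tuple(
        (MASKS[(i + j) % 4] & col[0]) ^ (MASKS[(i + j + 1) % 4] & col[1])
        ^ (MASKS[(i + j + 2) % 4] & col[2]) ^ (MASKS[(i + j + 3) % 4] & col[3])
        for j in range(4)
    )


def diffusion_order(i, col):
    """Number of nonzero input bytes plus number of nonzero output bytes."""
    return sum(1 for b in col if b) + sum(1 for b in pi_column(i, col) if b)


def place(value, positions):
    """Column with `value` at the given byte positions and zero elsewhere."""
    return tuple(value if k in positions else 0 for k in range(4))


def minimal_diffusion_vectors():
    vectors = set()
    for x in OMEGA_X:
        for k in range(4):
            vectors.add(place(x, {k}))                 # I_1: one nonzero byte
            vectors.add(place(x, {k, (k + 1) % 4}))    # I_2: two adjacent bytes
            vectors.add(place(x, {0, 1, 2, 3} - {k}))  # I_3: three nonzero bytes
    for y in OMEGA_Y:
        vectors.add(place(y, {0, 2}))                  # I_2: two separated bytes
        vectors.add(place(y, {1, 3}))
    return vectors


if __name__ == "__main__":
    vecs = minimal_diffusion_vectors()
    print(len(vecs))
    print(all(diffusion_order(i, v) == 4 for v in vecs for i in range(4)))
```

A byte whose set bits lie in two neighbouring bit pairs, such as 0x05, is not in either set, and a column containing only that byte already has diffusion order 5.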

Now let us examine the diffusion effect of $\tau \circ \pi$ through consecutive rounds.
This analysis can be done by assuming that in each round the S-box output 
can take any desired value, irrespective of the input value. This assumption is 
to maximally take into account the probabilistic nature of S-box transformation 
without details of the S-box characteristics. Since it suffices to consider worst- 
case propagations, we only examine inputs with 1, 2, or 3 nonzero bytes in any 
one column vector of a 4 x 4 byte array, say the first byte column. The result 
is shown in Table 2, where we only showed the nonzero column vector in the 
starting 4x4 byte array. The sum of the number of nonzero bytes throughout 
the evolution is of great importance to ensure resistance against differential and 
linear cryptanalysis. Table 2 shows that the number of nonzero bytes per round 
is repeated with period 4 and their sum up to round 8 is at least 32. 



starting nonzero vector \ round   1   2   3   4   5   6   7   8
$I_{1j}$ (0 ≤ j ≤ 3)               1   3   9   3   1   3   9   3
$I_{2j}$ (0 ≤ j ≤ 5)               2   2   6   6   2   2   6   6
$I_{3j}$ (0 ≤ j ≤ 3)               3   1   3   9   3   1   3   9

Table 2. Minimum possible no. of active bytes (without considering S-box char.)



3.2 S-Boxes Construction and Their Property 



The S-box for a block cipher should be chosen to satisfy two important requirements:
differential uniformity and nonlinearity. Combined with the diffusion
effect of the linear transformations used, they directly determine the security level
of the block cipher against DC and LC.

The maximum differential and linear approximation probabilities for an n x n
S-box S ($\delta_S$ and $\lambda_S$ for short) can be defined as follows. Let X and Y be the sets
of $2^n$ possible inputs and outputs of S, respectively. Then $\delta_S$ and $\lambda_S$ are defined by

    $\delta_S \stackrel{\mathrm{def}}{=} \max_{\Delta x \neq 0,\, \Delta y} \frac{\#\{x \in X \mid S(x) \oplus S(x \oplus \Delta x) = \Delta y\}}{2^n}$,        (4)

    $\lambda_S \stackrel{\mathrm{def}}{=} \max_{\Gamma x,\, \Gamma y \neq 0} \left( \frac{|\#\{x \in X \mid x \cdot \Gamma x = S(x) \cdot \Gamma y\} - 2^{n-1}|}{2^{n-1}} \right)^2$,        (5)

where $a \cdot b$ denotes the parity of the bit-wise product of a and b.
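Definitions (4) and (5) translate directly into an exhaustive search over all difference pairs and selection patterns. The Python sketch below illustrates this on a 4-bit S-box; the S-box used is the 4-bit S-box of the PRESENT cipher, chosen purely as a small test input, not CRYPTON's S-box (whose table is not listed in this section). The function names are hypothetical.

```python
def parity(v):
    """Parity of the bits of v (the 'dot product' when v = a & b)."""
    return bin(v).count("1") & 1


def max_differential_probability(sbox, n):
    """delta_S of definition (4) for an n x n S-box given as a list."""
    size = 1 << n
    best = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        best = max(best, max(counts))
    return best / size


def max_linear_probability(sbox, n):
    """lambda_S of definition (5) for an n x n S-box given as a list."""
    size = 1 << n
    half = size // 2
    best = 0
    for gx in range(size):
        for gy in range(1, size):
            agree = sum(1 for x in range(size)
                        if parity(x & gx) == parity(sbox[x] & gy))
            best = max(best, abs(agree - half))
    return (best / half) ** 2


# Illustrative input only: the 4-bit S-box of PRESENT.
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

if __name__ == "__main__":
    print(max_differential_probability(TOY_SBOX, 4))
    print(max_linear_probability(TOY_SBOX, 4))
```

For an 8 x 8 S-box the same functions apply with n = 8; the linear search then visits about 2^24 (pattern, input) combinations, which is slow in pure Python but still feasible.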

The nonlinear transformation adopted in CRYPTON is byte-wise substitution
using four 8 x 8 S-boxes, $S_i$ (i = 0, 1, 2, 3). We first constructed an 8 x 8
involution S-box S from two 4-bit permutations (P-boxes, for short), $P_0$ and
$P_1$, using an SP network, as shown in Fig. 1. Then the actual four S-boxes were
derived from S as follows: for each $x \in [0, 256)$,

S'o(x) = S'(x)^^, S'i(a:) = S'(a;)^^, S2(x) = S(x^'^), Ss(x) = S(x^^). 

It is easy to see that these S-boxes satisfy the inverse relationships
$S_0^{-1} = S_2$ and $S_1^{-1} = S_3$. We decided to use four variants of S, rather than
just one involution S-box or four independent S-boxes, because this will make
iterative characteristics harder to occur while reducing the storage required in
some limited computing environments (e.g., low-cost smart cards).
Fig. 1. The selected 8 x 8 involution S-box S



The involution S-box S was searched for, over some limited space of good
4-bit P-boxes and linear involutions, in such a way that it has the best possible
differential and linear characteristics. Moreover, among such a set of candidate
involution S-boxes, we selected the final S-box S considering the following two
additional requirements:

1. The high-probability I/O difference pairs (selection patterns, resp.) in S
should have as high Hamming weights as possible.

2. The number of high-probability difference pairs (selection patterns, resp.) in
the resulting 8 x 8 S-boxes $S_i$ should be as small as possible when the input
is restricted to the minimal diffusion set $\Omega_y$.

These requirements are to ensure that high-probability differences/selection patterns
are diffused more rapidly by the linear transformations and that it is
more difficult to form a chain of high-probability S-box characteristics/linear
approximations through consecutive rounds.

Table 3 shows the statistics of the S-boxes on the distribution of input-output
difference/linear approximation pairs, where the entry values are computed by the
numerator of equations (4) and (5).
Table 3. Distribution of difference/linear approx. pairs for $S_i$



From the table, we can see that for each i,

    $p_d \stackrel{\mathrm{def}}{=} \delta_{S_i} = \frac{10}{256} = 2^{-4.68}$,    $p_l \stackrel{\mathrm{def}}{=} \lambda_{S_i} = \left(\frac{32}{128}\right)^2 = 2^{-4}$,

and that there are only 7 difference pairs achieving the best characteristic probability
$p_d$ (6 selection patterns achieving the best linear approximation probability
$p_l$). As we aimed at in the S-box selection process, these high-probability characteristics
have fairly heavy Hamming weights. E.g., for the characteristics with the top
two probabilities (69 pairs for DC and 42 pairs for LC), the sum of input
and output Hamming weights is at least 4 and larger than 8 on average.
Table 4. The most probable characteristics over the restricted set $\Omega_y$

More importantly, if the input is restricted to the minimal diffusion set $\Omega_y$,
the maximum entry values are at most 6 and 24 for differential and linear characteristics,
respectively. There are only 4 such difference pairs and 1 such selection
pattern in each S-box, as shown in Table 4. Note that even the pairs in the table
belong to the more restricted set $\Omega_y - \Omega_x$. Since they are more important for
worst-case analysis of DC and LC, we define these probabilities as

    $p'_d \stackrel{\mathrm{def}}{=} \delta^{\Omega_y}_{S_i} = \frac{6}{256} = 2^{-5.42}$,    $p'_l \stackrel{\mathrm{def}}{=} \lambda^{\Omega_y}_{S_i} = \left(\frac{24}{128}\right)^2 = 2^{-4.83}$.
3.3 Differential Cryptanalysis

Let us first evaluate the best r-round characteristic probability for CRYPTON. 
In the following we only consider characteristics up to 8 rounds since that will 
be sufficient to show the resistance of CRYPTON to differential cryptanalysis. 

First note that the probability of any characteristic in CRYPTON can be 
completely determined by the number of active S-boxes and their char, probabi- 
lities (e.g., see m)- Since the number of active S-boxes involved in any 8-round 
characteristic is at least 32, we can obtain the most rough upper bound for the 
best 8-round char, probability as pcs = under the assumption of 

independent and uniform distribution for plaintexts and round keys, where we 
assumed that all the S-boxes involved have the best char, probability pd- 
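The exponents involved in this estimate are simple logarithms and can be recomputed as a sanity check. The Python sketch below assumes 8 x 8 S-boxes, the entry values 10, 32, 6 and 24 quoted in Sect. 3.2, and 32 active S-boxes over 8 rounds that all attain the best probability $p_d$; the helper names are hypothetical.

```python
import math


def log2_diff_prob(count, n=8):
    """log2 of a differential probability count / 2**n, cf. definition (4)."""
    return math.log2(count / 2 ** n)


def log2_lin_prob(deviation, n=8):
    """log2 of a linear probability (deviation / 2**(n-1))**2, cf. definition (5)."""
    return 2 * math.log2(deviation / 2 ** (n - 1))


if __name__ == "__main__":
    p_d = log2_diff_prob(10)        # best S-box characteristic
    p_l = log2_lin_prob(32)         # best S-box linear approximation
    p_d_restricted = log2_diff_prob(6)
    p_l_restricted = log2_lin_prob(24)
    print(f"p_d  = 2^{p_d:.2f},  p_l  = 2^{p_l:.2f}")
    print(f"p'_d = 2^{p_d_restricted:.2f},  p'_l = 2^{p_l_restricted:.2f}")
    # Rough 8-round exponent: 32 active S-boxes, each with probability p_d.
    print(f"32 * log2(p_d) = {32 * p_d:.1f}")
```

The last line gives the exponent of the rough bound under the stated assumption; any characteristic that activates more than 32 S-boxes, or uses S-box characteristics of lower probability, has a smaller probability.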

However, as can be seen from Table 4, the minimum number of active S-boxes
shown in Table 2 cannot be achieved even with S-box characteristics
with probability $p'_d$. Moreover, if we allow intermediate S-box output differences
with larger diffusion orders, then the number of active S-boxes up to round 8
will grow much larger than the bound 32. Considering the rapid diffusion by
linear transformations, we can reasonably assume that a characteristic involving
a smaller number of active S-boxes with smaller S-box char. probabilities should
give better overall probability than a characteristic involving a larger number of
active S-boxes with larger S-box char. probabilities. Therefore, we can obtain a
tighter bound for the 8-round char, probability as pcg < The 

actual probability will be much lower than this bound, but we do not proceed
any further since this bound is low enough to show the strong resistance of
CRYPTON against DC based on the best characteristic.

Given a pair of input and output differences, there may be a relatively large 
number of characteristics starting with the input difference and ending with the 
output difference. It is not easy to estimate the number of such characteristics 
that can reside in a differential. However, the estimated 8-round char, proba- 
bility, together with a rough analysis solely based on the diffusion property of 
linear transformations, shows that no 8-round differential can have probability 
significantly larger than 2“^^®. Therefore, we believe that CRYPTON with 9 or 
more rounds is far secure against the basic differential attack. 

3.4 Linear Cryptanalysis 

An r-round linear approximation involves a number of S-box linear approximati- 
ons and, as in differential cryptanalysis, the number of such S-boxes (i.e., active 
S-boxes) determines the complexity of linear cryptanalysis. Much the same way 
as DC, we can obtain a rough bound for the best 8-round linear approximation 
probability pl^ as pl^ < Again this value is a very loose upper 

bound. Actually there will be no linear approximation achieving this probabi- 
lity, considering the linear characteristic of S-boxes and the linear transformation 
involved. 

As in the differential attack, we may use multiple linear approximations to 
improve the basic linear attack jlYlliS) . Suppose that one can derive N linear 
approximations involving the same key bits with the same probability. Then
the complexity of a linear attack can be reduced by a factor of N, compared 
to a linear attack based on a single linear approximation ca However, a large 
number of linear approximations involving the same key bits are unlikely to be 
found in most ciphers, in particular in CRYPTON. Multiple linear approximati- 
ons involving different key bits may be used to derive the different key bits in the 
different linear approximations simultaneously with almost the same complexity 
m- However, this will be of little help to improve the basic linear attack, since 
we already have a linear approximation probability far beyond any practical 
attack. Therefore, we believe that there will be no linear attack on CRYPTON 
with 9 or more rounds with a complexity lower than 2^^®. 

3.5 Security against Other Possible Attacks 

There are some variants to the basic differential attack discussed above. Knud- 
sen introduced the idea of a truncated differential HSI, i.e., a differential that 
predicts only part of the difference (not the entire value of difference), and de- 
monstrated that this variant may be more effective against some ciphers than the 
basic differential attack and may be independent of the S-boxes used jlSl. Due 
to the fairly uniform diffusion by bit-wise permutations, we believe that trun- 
cated differentials will not be much useful in CRYPTON compared to ordinary 
differentials. 

The higher order differential attack was first considered by Lai m and furt- 
her investigated in lEEl. Let d be the poly, degree of (r — l)-round output bits 
expressed as polynomials of plaintext bits. Then the higher order DC can find 
some key bits of the last round for an r-round cipher using about $2^{d+1}$ chosen
plaintexts m- Obviously the success of this attack depends on the nonlinear 
order of S-box outputs. Since CRYPTON uses S-boxes with nonlinear order 6, 
the poly. degree of output bits after 3 rounds increases to $6^3 > 128$. Therefore,
the higher order DC on CRYPTON will be completely infeasible after 4 rounds. 

There also exist some algebraic attacks using polynomial relations between
ciphertexts and plaintexts. The interpolation attack [?] proposed by Jakobsen
and Knudsen is applicable if the number of coefficients in the polynomial
expression of the ciphertext is less than the size of the ciphertext space. Its
probabilistic variant allows the use of probabilistic non-linear relations at
the cost of increased complexity [?]. The S-boxes used in CRYPTON do not allow
any simple algebraic description, and the bit permutation $\pi$ in each round
further complicates algebraic relations between S-box outputs. We thus believe
that this kind of algebraic attack cannot be applied to CRYPTON.

Another notable attack is the differential attack based on impossible
differentials, recently introduced by Biham et al. [?]. It does not seem easy to
systematically find impossible events in block ciphers based on the SP network.
Thus the applicability of this attack to CRYPTON should be further investigated
in the future. Other variants of differential attacks, such as the
differential-linear attack [?] and the boomerang attack (a
differential-differential style attack) [?], do not appear to work better on
CRYPTON than the basic differential attack.

There are also several variants or generalizations of linear cryptanalysis.
These include linear cryptanalysis using non-linear approximations [?],
generalized linear cryptanalysis using I/O sums [?], and partitioning
cryptanalysis [?]. We have not checked in detail the effectiveness of these
attacks against CRYPTON. However, our observation on the diffusion property of
$\pi$ shows that any kind of I/O relation involving more than two bits in
S-boxes should rapidly increase the number of active S-boxes involved in the
overall I/O relations. So, we believe that there is little chance of these
attacks substantially improving on the basic linear attack.

Finally, we note that there exists a specialized attack on SQUARE-like ciphers,
the so-called dedicated SQUARE attack [?], which can also be applied to a
reduced variant of CRYPTON (see [?]). However, this attack only uses the
balancedness of the XOR sum of intermediate round outputs for a set of different
inputs, so its applicability is limited to at most a 6-round version of CRYPTON.



3.6 Key Schedule Cryptanalysis 

Key schedule cryptanalysis is another important category of attacks on block
ciphers. Typical weaknesses exploited in key schedule cryptanalysis include weak
keys or semi-weak keys, equivalent keys, related keys and simple relations such
as the complementation property existing in DES (for details, see e.g. [?]).
These weaknesses can be exploited to speed up an exhaustive key search or to
mount related-key attacks. Though most of these attacks on key schedules are
not practical in normal use, they may be a serious flaw in certain circumstances
(e.g., when a block cipher is used as a building block for hash functions).

The key schedule of CRYPTON is designed with the above known weaknesses 
in mind. First remember the two step generation of round keys in CRYPTON: 
First, a user key of 256 bits or less is transformed into 8 expanded keys via 
invertible nonlinear transformations. Then, the first 4 expanded keys are used 
to generate round keys in even rounds and the remaining 4 in odd rounds. In 
each round, the expanded keys are updated by a word rotation and bit or byte 
rotations and then xored with distinct constants to produce round keys. 

The first step of the key schedule shows that no different user keys can 
produce the same expanded keys and that there is little possibility of simple 
relations between different user keys being preserved in expanded keys. Thus we 
believe that there are no equivalent keys or simple relations in the CRYPTON 
key schedule. It is also very unlikely that there exist related keys that can be
used to mount related-key differential attacks or related-key slide attacks,
since a nonlinearly transformed user key is applied 6 or 7 times throughout
encryption, each time being updated by rotations and constant additions.

Weak keys or semi-weak keys, if any, are usually due to the symmetry in 
encryption and key scheduling processes. This symmetry can be destroyed most 
easily by using distinct round constants in the key schedule. In CRYPTON we 
used different rotation amounts and round constants for each round key. So, we 
also believe that no such keys exist in CRYPTON. 






4 Implementation and Efficiency 

The overall structure of CRYPTON allows a very high degree of parallelism.
This will result in high efficiency and flexibility in both software and hardware
implementations.

The round transformation of CRYPTON can be efficiently implemented on
a 32-bit microprocessor using table lookups, if we use an additional 4 Kbytes of
storage. The idea is to precompute and store 4 tables of 256 words as follows:

$$SS_i[j] = \bigoplus_{k=0}^{3} \left(S_i[j] \wedge M_{(i+k) \bmod 4}\right)^{\ll 8k} \quad \text{for } 0 \le i \le 3 \text{ and } 0 \le j \le 255.$$

We can then implement the odd round function $B = \rho_{oK}(A)$ by

$$B[j] = \bigoplus_{i=0}^{3} SS_{(i+j) \bmod 4}[a_{ij}] \oplus K[j] \quad \text{for } 0 \le j \le 3.$$

Similarly, the even round function can be implemented by

$$B = \rho_{eK}(A) = \rho_{oK}((A[1], A[0], A[3], A[2])^t).$$

We have implemented CRYPTON in C (with in-line assembly in the case 
of Pentium Pro) and measured its speed on 200 MHz Pentium Pro running 
Windows 95 (with 32 Mbytes of RAM) and on 167 MHz UltraSparc running 
Solaris 2.5. The result is shown in Table 5. Our optimized C code runs quite 
fast, giving an encryption rate of about 6.7 Mbytes/sec on Pentium Pro and 
about 4.4 Mbytes/sec on UltraSparc. The partial assembly code on Pentium 
Pro can encrypt/decrypt about 8.0 Mbytes per second, running about 20 % 
faster than the optimized C code. We expect that a fully optimized assembly 
implementation will run a little bit faster. 



Language \ Clocks     Key setup (enc/dec)    Encryption
In-line Asm (PP)      N/A                    381
MSVC 5.0 (PP)         327 / 397              452
GNU C (US)            496 / 564              575

Table 5. Speed of CRYPTON on Pentium Pro and UltraSparc (for 128-bit keys)
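
The encryption rates quoted in the text follow from the clock counts in Table 5. The sketch below redoes that conversion; it assumes a 16-byte block and that "Mbyte" means 2^20 bytes, which is the reading under which the figures agree. The function name is illustrative.

```python
def mbytes_per_sec(clock_hz: float, clocks_per_block: int, block_bytes: int = 16) -> float:
    """Throughput in Mbytes/sec (1 Mbyte = 2**20 bytes, an assumption)."""
    blocks_per_sec = clock_hz / clocks_per_block
    return blocks_per_sec * block_bytes / 2**20


asm_pp = mbytes_per_sec(200e6, 381)   # in-line assembly, 200 MHz Pentium Pro
msvc_pp = mbytes_per_sec(200e6, 452)  # MSVC 5.0, 200 MHz Pentium Pro
gcc_us = mbytes_per_sec(167e6, 575)   # GNU C, 167 MHz UltraSparc

for name, rate in [("Asm (PP)", asm_pp), ("MSVC (PP)", msvc_pp), ("GNU C (US)", gcc_us)]:
    print(f"{name}: {rate:.2f} Mbytes/sec")
```

The results land within rounding of the roughly 8.0, 6.7 and 4.4 Mbytes/sec stated above.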

The key setup time of CRYPTON is different for encryption and decryption.
Decryption key setup requires a little more computation because the expanded
keys must be transformed. Our encryption key schedule is very fast, taking
much less time than one-block encryption (though the code for key scheduling
was not fully optimized). As a result, CRYPTON will be very efficient for use as
a building block for hash functions or in the case of encrypting/decrypting only
a few blocks of data (e.g., MACs for entity authentication). Note that all the
timings remain almost the same for different sizes of user keys.

CRYPTON can be efficiently implemented on other platforms as well. For
smart card implementations, we can store only the 256 bytes of the involution
S-box $S$ and compute each entry of the $S_i$'s using just one rotation. The RAM
requirement is also very small, just 52 bytes in total (20 bytes for data
variables and 32 bytes for a user key). So we can expect that CRYPTON will run
quite fast on low-cost smart cards, since all computations can be efficiently
implemented using only byte operations. Also, because of its high parallelism,
CRYPTON will be ideal for implementation on DSPs that have multiple execution
units.



Optimized in   Delay (nsec)   Cycles   Mbits/sec   Total (cell) area
Area           18.97          7        900         51527 (18323)
Time           10.24          7        1660        74021 (28180)

Table 6. Estimated speed of CRYPTON in a gate array implementation (from Synopsys)

Hardware efficiency is one of the design objectives of CRYPTON. To estimate the
speed in hardware, we carried out some simulations with Synopsys using a
commercial 0.35 micron gate array library. The result is shown in Table 6. This
table shows that we can easily achieve a rate on the order of a gigabit per
second in hardware using only a small amount of chip area.

5 Conclusion 

We described CRYPTON version 1.0, an enhanced version of our AES proposal, 
and analyzed its security and efficiency. CRYPTON was designed by considering 
efficiency in various implementation environments. Its symmetry in encryption 
and decryption greatly reduces the hardware complexity. The S-boxes and li- 
near transformations are designed by considering efficient implementations in 
hardware logic as well. The key scheduling algorithm runs very fast and allows 
efficient implementations in hardware and under limited environments. 

Our preliminary analysis shows that 12-round CRYPTON is secure against most
known attacks by a wide margin. At present the best attack on CRYPTON appears
to be exhaustive key search. However, as usual, more extensive analysis should
be done before practical applications of a newly introduced cipher, so we
strongly encourage the reader to further investigate our new version of
CRYPTON. We would greatly appreciate any reports on its analysis.

Abstract. In this paper we present an attack on a reduced round version
of Crypton. The attack is based on the dedicated Square attack. We
explain why the attack also works on Crypton and prove that the
entire 256-bit user key for 6 rounds of Crypton can be recovered with
a complexity of $2^{56}$ encryptions, whereas for Square $2^{72}$ encryptions
are required to recover the 128-bit user key.



1 Introduction 

The block cipher Crypton [3] was recently proposed as a candidate algorithm for
the AES [5]. In this paper we describe a chosen plaintext attack that works if
the cipher is reduced to 6 rounds instead of the specified 12 rounds. Our attack
is based on the dedicated Square attack presented in [2], but because of the
differences between Square and Crypton, the attack has to be modified in
several points.

Previous analysis of Crypton led to the discovery of a failure of the key
scheduling, resulting in a number of weak keys [1, 6]. Our attack works on a
reduced version of Crypton for all keys. For a final optimisation of the attack,
we exploit another feature of the key scheduling.

In Section 2 we give a short description of Crypton. We present the basic
attack in Section 3. Section 4 discusses the recovery of the user key. Section 5
and Section 6 discuss the extension of the attack to five rounds, and Section 7
gives the six round attack. We conclude in Section 8.

2 Description of Crypton 

The block cipher Crypton is based on Square [2]. The plaintext data is ordered
in 16 bytes, which are put in a square scheme, called the state. If $A$ is the
state at a certain moment, the different bytes of $A$ are called $(A)_{ij}$ with
$i$ and $j$ varying from 0 to 3 (see Figure 1). Crypton uses 6 elementary
transformations.

* F.W.O. Postdoctoral Researcher, sponsored by the Fund for Scientific Research - 
Flanders (Belgium) 



L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 46-59, 1999.
© Springer-Verlag Berlin Heidelberg 1999



Attack on Six Rounds of Crypton 






A03  A02  A01  A00
A13  A12  A11  A10
A23  A22  A21  A20
A33  A32  A31  A30

Fig. 1. Byte coordinates of the state



– $\sigma_{K_e^i}$ is a key addition (EXOR) with round key $i$. This operation is
  the same as the key addition in Square.

– $\pi_e$ and $\pi_o$ are linear transformations that act on columns of the
  state. The $\pi$-transformations operate on two bits at a time, calculating a
  new value by exoring the old values of two bits in corresponding positions in
  three different bytes of the column. These operations replace the MDS-based
  $\theta$ in Square. They can be implemented using four masks, denoted
  $M_0, \ldots, M_3$.

– $\gamma_e$ and $\gamma_o$ are non-linear transformations that apply S-boxes to
  the different state bytes. They have the additional property that
  $\gamma_e = \gamma_o^{-1}$. These operations correspond to the single $\gamma$
  in Square.

– $\tau$ is a simple transposition (upper row becomes rightmost column, lower
  row becomes leftmost column, ...). This operation is the same as the $\pi$ of
  Square. Note that $(D, C, B, A)^t$ means that $A$ is the upper row and $D$ is
  the lower row.

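Throughout this text we use versions of Crypton with fewer rounds than the
standard number of rounds, which is 12. The standard version of Crypton is:

$$\mathrm{Encrypt} = \phi_e \circ \rho_{eK_e^{12}} \circ \rho_{oK_e^{11}} \circ \cdots \circ \rho_{eK_e^{2}} \circ \rho_{oK_e^{1}} \circ \sigma_{K_e^{0}}$$

with

$$\phi_e = \tau \circ \pi_e \circ \tau$$
$$\rho_{eK} = \sigma_K \circ \tau \circ \pi_e \circ \gamma_e$$
$$\rho_{oK} = \sigma_K \circ \tau \circ \pi_o \circ \gamma_o$$

Since $\phi_e$ uses no key material, it is easily invertible by a cryptanalyst
and therefore we do not consider it in the following text. For example, a
five-round version of Crypton means in this text:

$$\mathrm{Encrypt}_5 = \rho_{oK_e^{5}} \circ \rho_{eK_e^{4}} \circ \rho_{oK_e^{3}} \circ \rho_{eK_e^{2}} \circ \rho_{oK_e^{1}} \circ \sigma_{K_e^{0}}.$$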

Unless stated otherwise, the output state of round $n$ is denoted $R_n$ in this
paper. $R_0$ represents the output after the initial key addition
$\sigma_{K_e^0}$. $PT$ represents the plaintext and $CT$ represents the
ciphertext. So $CT = \phi_e(R_{12})$.

3 Basic Attack: 4 Rounds of Crypton

In this section we will explain how the dedicated Square attack [2] can be
modified to attack Crypton. Due to the differences between both algorithms,
the attack works in a slightly different way. The final 6-round attack allows
the entire 256-bit user key to be recovered immediately, using less computer
time than the equivalent attack on Square, which only recovers a 128-bit key.
This makes the attack much more significant on Crypton than on Square.

First we explain the attack on 4 rounds of Crypton and the reason why it
works. The attack on 4 rounds uses approximately $2^9$ chosen plaintexts and
their corresponding ciphertexts. We show a way to recover a 128-bit user key
without using additional plaintexts. In the next sections a 5-round and a
6-round attack on Crypton are described, which require significantly more
chosen plaintexts.

Let a $\Lambda$-set be a set of 256 states that are all different in some of the
(16) state bytes (the active bytes) and all equal in the other state bytes (the
passive bytes). The 256 elements of the $\Lambda$-set are denoted $\Lambda_a$
with $a$ varying from 0 to 255. Let $\lambda$ be the set of indices of the
active bytes (indices varying from 0 to 3). We have:

$$\forall a, b \in \{0, \ldots, 255\},\ a \neq b: \quad
\begin{cases}
(\Lambda_a)_{ij} \neq (\Lambda_b)_{ij} & \text{for } (i,j) \in \lambda \\
(\Lambda_a)_{ij} = (\Lambda_b)_{ij} & \text{for } (i,j) \notin \lambda.
\end{cases}$$


We start with a $\Lambda$-set with a single active byte. From the definition it
follows that this byte will take all 256 possible values {0x00, 0x01, ..., 0xff}
over the different states in the $\Lambda$-set. As a consequence, the
$\Lambda$-set is balanced, by which we mean:

$$\bigoplus_{a=0}^{255} (\Lambda_a)_{ij} = \texttt{0x00}, \quad \forall i, j.$$

This is valid for the one active byte because $\bigoplus_{a=0}^{255} a$ = 0x00
and for the fifteen passive bytes because $\bigoplus_{a=0}^{255} b$ = 0x00 if
$b$ is a constant byte.
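
Both facts behind the balancedness claim are easy to confirm numerically. The following is a minimal check; the helper name `xor_sum` and the constant 0x5A are illustrative choices, not taken from the paper.

```python
from functools import reduce
from operator import xor


def xor_sum(values):
    """EXOR of all values in an iterable (0 for an empty iterable)."""
    return reduce(xor, values, 0)


active = list(range(256))    # an active byte: every value exactly once
passive = [0x5A] * 256       # a passive byte: the same constant 256 times

print(hex(xor_sum(active)), hex(xor_sum(passive)))
```

The active byte sums to zero because every bit position is set in exactly 128 of the 256 values, and the passive byte because any value EXORed with itself an even number of times cancels.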

We now investigate the balancedness, the positions and the properties of the
active bytes through the subsequent transformations. After each transformation
we have a new set of active bytes, which is called the state scheme. If we say
that the state scheme is balanced we mean that the transformation of the
original $\Lambda$-set up to this point is still balanced. It is also important
in the state scheme to know some properties of each byte, e.g. this byte is a
constant for each element of the $\Lambda$-set, or that byte takes every value
of the set {0x00, 0x01, ..., 0xff} over the $\Lambda$-set (mathematically:
$\bigcup_{a=0}^{255} \{(\Lambda_a)_{ij}\}$ = {0x00, 0x01, ..., 0xff}).



Round 0 (initial key addition) The state scheme evolution through round 0
is displayed in Figure 2. We have a single active byte which takes 256 different
values over the $\Lambda$-set (full black square in the figure). The 15 other
bytes are passive and therefore they have a constant value over the
$\Lambda$-set.

The initial key addition $\sigma_{K_e^0}$ does not change the balancedness of
the state scheme because the EXOR-sum of 256 times the same key byte cancels
out. $\sigma_{K_e^0}$ cannot change the state scheme either because the
EXOR-addition with a constant byte acts as a simple bijection on the set of all
possible byte values {0x00, 0x01, ..., 0xff}. Hence, if we have 256 different
byte values, the key addition will map them onto 256 different byte values, and
if we have 256 times the same byte value, the key addition will obviously map
them onto 256 times the same byte value.




Round 1 $\gamma_o$ does not change the state scheme nor the balancedness of the
scheme since $\gamma_o$ is bijective for each of the bytes. So 256 different
byte values are mapped onto 256 different byte values. After $\pi_o$ we have one
active column. The other three columns are still passive. Now we investigate
the properties of this active column. Therefore we have to take a closer look at
the linear transformation $\pi_o$, which acts separately on the four columns.
Let $A$ denote the input state of $\pi_o$ and let $B$ denote the output state.
Now we can write $\pi_o$ as

$$B = \pi_o(A)$$
$$(B)_{ij} = \bigoplus_{k=0}^{3} \left((A)_{kj} \wedge M_{(i+k) \bmod 4}\right),$$

where $\wedge$ is the binary AND-operator and $M_i$ with $i = 0, 1, \ldots, 3$
is a masking word, leaving 6 bits of every byte unmasked. For example $M_0$ =
0x3fcff3fc. In state $A$ we have one active byte which takes 256 different
values over the $\Lambda$-set. Now $\pi_o$ acts on this $\Lambda$-set generating
one active column, in which each of the four bytes contains 6 bits of the
original active byte of state $A$. Hence each byte of the active column of state
$B$ must take 64 different values, each occurring exactly 4 times.
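
The "64 values, each 4 times" property depends only on the mask keeping 6 of the 8 bits. The sketch below checks it for the four bytes of the example mask $M_0$ = 0x3fcff3fc; treating the mask byte by byte and ignoring the constant contribution of the passive bytes are simplifying assumptions of this illustration.

```python
from collections import Counter

# The four bytes of the example masking word M_0 = 0x3fcff3fc.
M0_BYTES = [0x3F, 0xCF, 0xF3, 0xFC]


def masked_distribution(mask: int) -> Counter:
    """How often each masked value occurs when the input byte runs over 0..255."""
    return Counter(x & mask for x in range(256))


for mask in M0_BYTES:
    dist = masked_distribution(mask)
    print(f"mask {mask:#04x}: {len(dist)} distinct values, counts {set(dist.values())}")
```

EXORing the constant passive contributions onto these masked values only relabels them, so the multiplicities are unchanged.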

Taking the EXOR of 4 times the same byte results in 0x00; state $B = \pi_o(A)$
is still balanced. In Figure 3 the new active bytes, which take 64 different
values each occurring 4 times, are displayed with a white + symbol in a black
square. $\tau$ is a simple matrix transposition, which only changes the
positions of the bytes, but does not change their value. Thus, $\tau$ does not
change the balancedness or the properties of the state scheme.
$\sigma_{K_e^1}$ does not change the balancedness or the properties either.




Round 2 Figure 4 shows the state scheme evolution through round 2. $\gamma_e$
does not change the state scheme since $\gamma_e$ will map a byte value
occurring 4 times onto another byte value occurring 4 times. This does not
change the EXOR-sum of the bytes, which remains 0x00.

$\pi_e$ generates 16 active bytes (all bytes of the state are now active). Due
to the specific structure of $\pi_e$ (similar to that of $\pi_o$), each byte of
the output state of $\pi_e$ takes $n \le 64$ different values. On a fixed
position in the output state of $\pi_e$, every possible byte value (from 0x00 to
0xff) occurs either zero or a multiple of 4 times. In Figure 4 these bytes are
shown as a white x symbol on a black square. This occurrence in multiples of 4
can be explained with the following example.

Suppose that we are working with nibbles (a nibble is a four-bit quantity). We
have a $\Lambda$-set consisting of 16 nibbles, e.g. $\Lambda$ = {1110, 1110,
1110, 1110, 1010, 1010, 1010, 1010, 1111, 1111, 1111, 1111, 0001, 0001, 0001,
0001}. We now have a situation similar to the real input $\Lambda$-set of
$\pi_e$. Now we look at the effect of a binary nibble mask, e.g. $M$ = 1110,
which leaves 3 of the 4 bits unmasked.

1110 ∧ 1110 = 1110        1111 ∧ 1110 = 1110
1110 ∧ 1110 = 1110        1111 ∧ 1110 = 1110
1110 ∧ 1110 = 1110        1111 ∧ 1110 = 1110
1110 ∧ 1110 = 1110        1111 ∧ 1110 = 1110
1010 ∧ 1110 = 1010        0001 ∧ 1110 = 0000
1010 ∧ 1110 = 1010        0001 ∧ 1110 = 0000
1010 ∧ 1110 = 1010        0001 ∧ 1110 = 0000
1010 ∧ 1110 = 1010        0001 ∧ 1110 = 0000

The table shows the effect of the bit masking. After the masking $\Lambda$ =
{1110, 1110, 1110, 1110, 1010, 1010, 1010, 1010, 1110, 1110, 1110, 1110, 0000,
0000, 0000, 0000}. Now the value 1110 occurs 8 times, the values 1010 and 0000
both occur 4 times, and the other possible nibble values do not occur at all.
We see that the values in the resulting $\Lambda$-set occur in multiples of 4.
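
The nibble example can be replayed directly. This short sketch uses the same 16 nibbles and the same mask as the text; the variable names are illustrative.

```python
from collections import Counter

# The 16-nibble set from the example: four values, each occurring 4 times.
lam = [0b1110] * 4 + [0b1010] * 4 + [0b1111] * 4 + [0b0001] * 4
MASK = 0b1110  # leaves 3 of the 4 bits unmasked

counts = Counter(x & MASK for x in lam)
for value, n in sorted(counts.items()):
    print(f"{value:04b} occurs {n} times")
```

Masking can merge groups of equal values but never split one, which is why every multiplicity stays a multiple of 4.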

If we look at the real $\pi_e$ then the only things we do are masking (leaving
unmasked 6 bits of every byte) and EXOR-additions. We can generalize our
conclusion from above and say that on every byte position every byte value in
the output state of $\pi_e$ occurs in multiples of 4 over the $\Lambda$-set.
This leaves the state balanced. $\tau$ and $\sigma_{K_e^2}$ do not change the
state scheme or the balancedness for reasons mentioned before.




Round 3 $\gamma_o$ does not change the state scheme properties or the
balancedness since $\gamma_o$ will map a byte value occurring a multiple of 4
times onto another byte value occurring the same multiple of 4 times. This does
not change the EXOR-sum of the bytes, which remains 0x00.

After $\pi_o$ all bytes are still active but the different values no longer
occur in multiples of four. This destroys our structure and limits the power of
the whole attack. These active bytes are displayed in Figure 5 as grey squares.
Nevertheless the scheme is still balanced because of the linearity of $\pi_o$.
This can be proven as follows. Let $A_a$ denote the balanced input state of
$\pi_o$ for the $a$-th element of the $\Lambda$-set, and let $B_a$ denote the
corresponding output state. Then we have the following:

$$B_a = \pi_o(A_a)$$

$$\begin{aligned}
\bigoplus_{a=0}^{255} (B_a)_{ij}
  &= \bigoplus_{a=0}^{255} \bigoplus_{k=0}^{3} \left((A_a)_{kj} \wedge M_{(i+k) \bmod 4}\right) \\
  &= \bigoplus_{k=0}^{3} \Bigl( \underbrace{\bigoplus_{a=0}^{255} (A_a)_{kj}}_{= \texttt{0x00}} \wedge\, M_{(i+k) \bmod 4} \Bigr) \\
  &= \texttt{0x00}.
\end{aligned}$$
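
The derivation above uses only the fact that AND with a fixed mask distributes over EXOR. The sketch below checks this on random balanced data; the byte-wise masks and the helper `pi_byte` are simplifying assumptions made for illustration and are not the exact CRYPTON transformation.

```python
import random
from functools import reduce
from operator import xor


def xor_sum(values):
    return reduce(xor, values, 0)


def pi_byte(column, masks):
    """One output byte of a pi-like map: EXOR of the four masked input bytes."""
    return xor_sum(b & m for b, m in zip(column, masks))


rng = random.Random(1)
masks = [0x3F, 0xCF, 0xF3, 0xFC]  # illustrative byte masks, 6 bits kept each

# 256 columns of 4 bytes; the last column is chosen so that every byte
# position is balanced (EXOR over all 256 columns is zero).
columns = [[rng.randrange(256) for _ in range(4)] for _ in range(255)]
columns.append([xor_sum(col[k] for col in columns) for k in range(4)])

inputs_balanced = all(xor_sum(col[k] for col in columns) == 0 for k in range(4))
output_sum = xor_sum(pi_byte(col, masks) for col in columns)
print(inputs_balanced, output_sum)
```

The individual output values are essentially random here, yet their EXOR-sum is still zero, which matches the situation after round 3.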



$\tau$ and $\sigma_{K_e^3}$ do not change the state scheme properties or the
balancedness.

Round 4 The output of round 3 is balanced:

$$\bigoplus_{a=0}^{255} (R_3)_a = (0),$$

with $(0)$ the all-zero state scheme ($(0)_{ij}$ = 0x00, $\forall i, j$). We
drop the index $a$ because the following formulae are valid for each element of
the $\Lambda$-set. We have for the output of round 4:

$$R_4 = \rho_{eK_e^4}(R_3) = (\sigma_{K_e^4} \circ \tau \circ \pi_e \circ \gamma_e)(R_3).$$

Taking the inverse of this formula and using the linearity of $\pi_e$ and
$\tau$ leads us to

$$R_3 = \gamma_o\left(\pi_e(\tau(R_4)) \oplus \pi_e(\tau(K_e^4))\right). \qquad (1)$$

Since $R_4$ is known, we can determine $\pi_e(\tau(R_4))$ completely. Now we
guess one of the 16 bytes of $\pi_e(\tau(K_e^4))$. Using (1) we can calculate
$R_3$ for this byte position for all 256 members of our $\Lambda$-set. If the
EXOR of all these values is zero then we have found a possible value for this
key byte. We can do this independently for all 16 key bytes. Experiments show
that only one or two values are possible for each key byte. If we do the same
again starting with a different $\Lambda$-set, we can find the 128 bits of the
round key with overwhelming probability.
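
The key-byte test based on (1) can be sketched for a single byte position as follows. This is a toy model under stated assumptions: `SBOX` is a randomly generated bijection standing in for $\gamma_e$ (not a CRYPTON S-box), `INV_SBOX` plays the role of $\gamma_o$, and `observed` stands for one byte of $\pi_e(\tau(R_4))$ over the 256 texts.

```python
import random
from functools import reduce
from operator import xor

rng = random.Random(2024)

# Toy bijective S-box and its inverse (assumption: not CRYPTON's S-boxes).
SBOX = list(range(256))
rng.shuffle(SBOX)
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x


def candidates(observed):
    """Key-byte guesses for which the recomputed round-3 bytes are balanced."""
    return [
        k for k in range(256)
        if reduce(xor, (INV_SBOX[c ^ k] for c in observed), 0) == 0
    ]


# Balanced (but otherwise arbitrary) round-3 bytes, as after round 3.
r3 = [rng.randrange(256) for _ in range(255)]
r3.append(reduce(xor, r3, 0))

true_key_byte = 0xA7
observed = [SBOX[x] ^ true_key_byte for x in r3]

survivors = candidates(observed)
print(len(survivors), true_key_byte in survivors)
```

The correct byte always passes the test, while each wrong guess passes with probability about 1/256, so a second set of 256 texts is normally enough to single it out.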

4 Calculation of the User Key of 128 Bits 

In this section we show how to calculate the user key of 128 bits when we know
one round key of 128 bits. In the previous section we have extracted round
key $K_e^4$. The user key can now be calculated by taking into account the key
expansion:

$$(V_e[3], V_e[2], V_e[1], V_e[0])^t = (\tau \circ \gamma_o \circ \sigma_P \circ \pi_o)((U[6], U[4], U[2], U[0])^t)$$
$$(V_e[7], V_e[6], V_e[5], V_e[4])^t = (\tau \circ \gamma_e \circ \sigma_Q \circ \pi_e)((U[7], U[5], U[3], U[1])^t)$$

$$T_0 = V_e[0] \oplus V_e[1] \oplus V_e[2] \oplus V_e[3]$$
$$T_1 = V_e[4] \oplus V_e[5] \oplus V_e[6] \oplus V_e[7]$$

$$E_e[i] = V_e[i] \oplus T_1 \quad \text{for } i = 0, 1, 2, 3$$
$$E_e[i] = V_e[i] \oplus T_0 \quad \text{for } i = 4, 5, 6, 7$$

with $U[i]$, $i = 0, 1, \ldots, 7$ the user key and $E_e[i]$ the expanded keys.
In [3] it is stated that if the user key is shorter than 256 bits it must be
prepended with zero-words, e.g. a 64-bit user key means $U[i]$ = 0x00000000 for
$i > 1$.

We now try to calculate a 128-bit user key given $K_e^4$. Using appropriate
shifts and constant additions we can easily find $E_e[0]$, $E_e[1]$, $E_e[2]$
and $E_e[3]$ given $K_e^4$ with the following formulae (see appendices in [3]):

ATe[16] =^;e[0]«®©i?Ci 

K^[n]=E,[l]^^^(BRCo 

ATe[18] = Ee[2]^^^ ® RCx 

ife[19] = ^;e[3]«® © i?Co. 

Here $A^{\ll i}$ denotes the left-wise bit rotation of a 32-bit word $A$ over
$i$ positions. The problem to be solved can be stated as follows: Given $E_e[i]$
for $0 \le i \le 3$, calculate $U[i]$ for $0 \le i \le 3$ knowing that $U[j]$
for $4 \le j \le 7$ are all zero. We are able to solve this problem with a
byte-wise reconstruction of the unknown values $T_0$ and $T_1$.



Rightmost byte of $T_1$ (byte 0) First of all we have to guess byte 0 of $T_1$.
This enables us to calculate byte 0 of $V_e[i]$ for $i = 0, 1, \ldots, 3$. Since

$$(V_e[3], V_e[2], V_e[1], V_e[0])^t = (\tau \circ \gamma_o \circ \sigma_P \circ \pi_o)((U[6], U[4], U[2], U[0])^t),$$

we find that

$$(\sigma_P \circ \gamma_e \circ \tau)((V_e[3], V_e[2], V_e[1], V_e[0])^t) = \pi_o((U[6], U[4], U[2], U[0])^t).$$

We know the upper row of the left side of this expression. From the right side
we know something about the structure of $\pi_o((U[6], U[4], U[2], U[0])^t)$
since $U[j]$ = 0x0 for $j = 4, 5, \ldots, 7$. This structure is:

[Scheme of 2-bit symbols 0, 2 and + for the bytes of the state; the layout is
not recoverable from the source.]

The symbols in this scheme denote 2-bit quantities in the total state. The four
symbols in the same row of a sub-group together form one byte of the state.
E.g., 2 + + 0 in the top left corner denotes the leftmost byte of $U[0]'$. A 0
or a 2 indicates copying the corresponding two bits of $U[0]$ or $U[2]$,
respectively. A + indicates writing the EXOR of the corresponding two bits of
$U[0]$ and $U[2]$. The scheme can be derived by taking into account the
different masks used in the linear transformation $\pi_o$ (see [3]).



Byte 1 of $T_1$ This byte can be found by checking the second row of
$\pi_o((U[6], U[4], U[2], U[0])^t)$. The four 2-bit positions in the scheme
where we have a + symbol in the upper and the second row must contain the same
2-bit values. This results in approximately one possible value for byte 1 of
$T_1$.

Byte 2 of $T_1$ This byte can be found by checking the third row of
$\pi_o((U[6], U[4], U[2], U[0])^t)$. We can calculate in advance 12 2-bit
positions of the third row of the scheme since $s_1 \oplus s_2 = s_3$ with
$s_1 s_2 s_3$ an arbitrary permutation of the symbols + 0 2. This also results
in approximately one possible value for byte 2 of $T_1$.

Leftmost byte of $T_1$ (byte 3) Since we have the upper three rows of the
scheme, we can calculate the lower row (using the same formula
$s_1 \oplus s_2 = s_3$), and calculate back to the leftmost column of
$(V_e[3], V_e[2], V_e[1], V_e[0])^t$. If we find four times the same value for
the leftmost byte of $T_1$ by checking the $E_e[i]$ values, we have a possible
user key. We do not expect that more than one valid user key can be found.



5 Addition of a Fifth Round at the End 

In this section we add a fifth round at the end of the basic attack by guessing
one column of round key $K_e^5$ at once. To recover $K_e^4$ we have to know only
one of the 16 bytes of $\pi_e(\tau(R_4))$ at a time, so knowledge of one row of
$R_4$ is sufficient. To add a fifth round to the attack we use the following
formula:

$$R_5 = (\sigma_{K_e^5} \circ \tau \circ \pi_o \circ \gamma_o)(R_4)$$
$$R_4 = \gamma_e\left(\pi_o(\tau(R_5)) \oplus \pi_o(\tau(K_e^5))\right),$$

which is valid because of the linearity of $\pi_o$ and $\tau$.

If we guess a row of $\pi_o(\tau(K_e^5))$ we can calculate a single row of $R_4$
and a single column of $\pi_e(\tau(R_4))$. Since
$R_3 = \gamma_o\left(\pi_e(\tau(R_4)) \oplus \pi_e(\tau(K_e^4))\right)$ must be
balanced, we can exclude approximately $\frac{255}{256}$ of our $2^{40}$ guessed
key values ($2^{32}$ for the row of $\pi_o(\tau(K_e^5))$ and $2^8$ for the one
byte of $\pi_e(\tau(K_e^4))$ gives $2^{40}$ guessed key values). This means that
we have to repeat this procedure for at least 5 $\Lambda$-sets in order to find
the round keys $K_e^4$ and $K_e^5$, from which we can calculate the entire
256-bit user key due to the simple key scheduling mechanism (see appendices in
[3]).
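
The figure of 5 $\Lambda$-sets follows from a simple count: each set keeps a wrong guess with probability about 1/256. The sketch below redoes this arithmetic; the function name is illustrative, and the assumption is that the filters from different sets act independently.

```python
def lambda_sets_needed(log2_guesses: int, log2_filter: int = 8) -> int:
    """Number of sets until about one wrong guess is expected to remain."""
    sets, remaining = 0, log2_guesses
    while remaining > 0:
        remaining -= log2_filter  # each set removes a factor of about 2^8
        sets += 1
    return sets


n_sets = lambda_sets_needed(40)   # 2^32 (row guess) * 2^8 (byte guess)
n_plaintexts = n_sets * 256       # 256 chosen plaintexts per set
print(n_sets, n_plaintexts)
```

Five sets of 256 texts stay below $2^{11}$ chosen plaintexts in total, while the work is dominated by the $2^{40}$ key guesses.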

6 Addition of a Fifth Round in the Beginning 

In this section we add a fifth round at the beginning of the basic attack. We
try to generate $\Lambda$-sets with only one active byte (taking 256 different
values) at the output of round 1. We start with a pool of $2^{32}$ plaintexts
that differ only in the byte values of the first column. We assume a value for
the 4 bytes of the first column of the first round key. This enables us to
compose a few sets.

Let $A$ be the desired output state of $\pi_o$ of the first round and let $PT$
be the plaintext state.

$$A = (\pi_o \circ \gamma_o \circ \sigma_{K_e^0})(PT)$$
$$PT = (\sigma_{K_e^0} \circ \gamma_e \circ \pi_o)(A) = K_e^0 \oplus \gamma_e(\pi_o(A))$$

Since in $\gamma_e(\pi_o(A))$ only the first column is active, we can reuse the
texts of our pool for every value of $K_e^0$. Given a $\Lambda$-set, we can
recover the value of $K_e^5$ with our four round attack on rounds 2, 3, 4 and 5.
We repeat the attack several times with different $\Lambda$-sets. If the values
suggested for $K_e^5$ are inconsistent, we have made a wrong assumption for the
column of $K_e^0$. With this method we can find the round keys and hence the
full 256 bits of the user key.

7 6-Round Version of Crypton 

The six round attack is a combination of the two previous extensions of the
basic 4-round attack. Due to the specific generation of round key $K_e^6$ we can
make an improvement of $2^{16}$ on the dedicated Square attack, and recover the
full 256 key bits.

We first guess one byte column of $K_e^0$ ($2^{32}$ possibilities). For each
guess we can generate some $\Lambda$-sets at the output of $\pi_o$ of round 1
with the formula:

$$PT = \gamma_e(\pi_o(A)) \oplus K_e^0.$$

Addition of a round at the end requires the knowledge of a row of
$\pi_e(\tau(K_e^6))$. If we know a column of $K_e^0$ then we also know 4 bytes
of $K_e^6$:

$$K_e^0 = (K_e[3], K_e[2], K_e[1], K_e[0])^t$$

$$K_e[0] = E_e[0]$$
$$K_e[1] = E_e[1]$$
$$K_e[2] = E_e[2]$$
$$K_e[3] = E_e[3]$$

$$K_e^6 = (K_e[27], K_e[26], K_e[25], K_e[24])^t$$

ATe[24] = Fie[0]«24 © RCi 
Kg [25] = Fig[l]«24 © RCo 2 
Ke[26] = Fig[2]«® © RC\ 

Kg [27] = Kg[3]«® © RCo2 

If we want to know a row of $\pi_e(\tau(K_e^6))$ we have to know a column of
$K_e^6$, and we have to guess only 16 bits instead of the full 32 bits as in the
Square attack if we choose the right columns of $K_e^0$.

This 6-round attack will recover $K_e^0$ and the equivalent $K_e^6$, but also
$K_e^5$. From these values we can calculate $E_e[i]$ for $i = 0, 1, \ldots, 7$,
hence we can calculate the entire 256-bit user key.

8 Conclusion

We have described attacks on several reduced round versions of the block cipher
Crypton. Table 1 summarizes the requirements of the attacks. The 5-round (a)
attack is described in Section 5 and the 5-round (b) attack in Section 6.

In its present form the described attack poses no real threat to the full
12-round version of Crypton. However, after the discovery of weak keys [1, 6]
of Crypton, this is the second time that the key scheduling of Crypton is
brought into discredit.



Table 1. Requirements for the described attacks on Crypton.

Attack         # Plaintexts   Time     Memory
4-round        2^9            2^9      small
5-round (a)    2^11           2^40     small
5-round (b)    2^32           2^40     2^32
6-round        2^32           2^56     2^32



References 

1. J. Borst, “Weak keys of Crypton,” technical comment submitted to NIST.

2. J. Daemen, L. Knudsen and V. Rijmen, “The block cipher Square,” Fast Software
Encryption, LNCS 1267, E. Biham, Ed., Springer-Verlag, 1997, pp. 149-165.

3. Lim, “CRYPTON: A New 128-bit Block Cipher,” available from [5].

4. Lim, “Specification and Analysis of Crypton Version 1.0,” FSE ’99, these
proceedings.

5. NIST’s AES home page, http://www.nist.gov/aes.

6. S. Vaudenay, “Weak keys in Crypton,” announcement on NIST’s electronic AES
forum, cf. [5].

A Attack on Six Rounds of Crypton Version 1.0 

In [4] a new version of Crypton is proposed, Crypton version 1.0. We explain
briefly how to extend our results to version 1.0, which features two major
changes.

1. The nonlinear transformations $\gamma_o$ and $\gamma_e$ now use two S-boxes
instead of only one. This doesn’t influence our attack.

2. The key scheduling has been changed, both in the generation of the expanded
keys and in the generation of the round keys from the expanded keys. This
influences our attack, but we will see that the attack still applies.

A.1 Round Key Derivation in Version 1.0

The relation between round key 0 and round key 6 is very important for the
complexity of our attack. In the new version this relation is more complex and
uses a new operation, which is defined as a left-wise bit rotation of each of
the four bytes of the 32-bit word $A$. The new calculation of round key 0 and
round key 6 is:

$$K_e[00] = E_e[0] \oplus \texttt{0x09e35996}$$
$$K_e[01] = E_e[1] \oplus \texttt{0xfc16ac63}$$
$$K_e[02] = E_e[2] \oplus \texttt{0x17fd4788}$$
$$K_e[03] = E_e[3] \oplus \texttt{0xc02a905f}$$

Ke[24] = © 0xa345054a 

Ke[25] = © 0x56b0f0bf 

Ke[26] = 0 0xbd5blb54 

Ke[27] = (i;e[2]«'>6)«® © 0x6a8ccc83 

Notice that if we know one column of $(K_e[03], K_e[02], K_e[01], K_e[00])^t$ then we
know 16 bytes of a certain column of $(K_e[27], K_e[26], K_e[25], K_e[24])^t$ because
of the double occurrence of the operator in the previous table. This is the
reason why our six-round attack still works on version 1.0 with the gain of 2^® 
time. 

B Calculation of the User Key of 128 Bits 

B.1 Generation of the Expanded Key

We show in this section how we can calculate the 128-bit user key when we know
one roundkey of 128 bits. In the specification of Crypton version 1.0 [4], the
new generation of the expanded keys is as follows.

Let $K = k_{u-1} \cdots k_1 k_0$ be a user key of $u$ bytes ($u = 0, 1, \ldots, 32$). We assume
that $K$ is 256 bits long (by prepending as many zeros as required).

1. Split the user key into $U$ and $V$ as: for $i = 0, 1, 2, 3$,

   $U[i] = k_{8i+6} k_{8i+4} k_{8i+2} k_{8i}, \quad V[i] = k_{8i+7} k_{8i+5} k_{8i+3} k_{8i+1}.$

2. Transform $U$ and $V$ using round transformations $\rho_o$ and $\rho_e$, respectively,
with all-zero round keys:

   $U' = \rho_o(U), \quad V' = \rho_e(V).$

3. Compute 8 expanded keys $E_e[i]$ for encryption as: for $i = 0, 1, 2, 3$,

   $E_e[i] = U'[i] \oplus T_1, \quad E_e[i+4] = V'[i] \oplus T_0,$
   where $T_0 = \bigoplus_{i=0}^{3} U'[i]$ and $T_1 = \bigoplus_{i=0}^{3} V'[i]$.

Since we know that $U[i]$ and $V[i]$ are all-zero for $i = 2, 3$, we know the lower 2
rows of $U$ and $V$. Since $U' = \tau \circ \pi_o \circ \gamma_o(U)$, we can calculate $T_1$, $T_0$ and the user
key by a byte-wise reconstruction of $T_1$.
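
The splitting in step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_user_key` is hypothetical, the key is passed in written order (k_{u-1} first), and each 32-bit word is formed with its leftmost byte as the most significant one. The transformations of steps 2 and 3 are not modelled.

```python
def split_user_key(key: bytes):
    """Split a user key, written k_{u-1} ... k_1 k_0, into the word arrays U and V."""
    if len(key) > 32:
        raise ValueError("user key must be at most 32 bytes")
    padded = key.rjust(32, b"\x00")   # prepend zeros up to 256 bits
    k = padded[::-1]                  # now k[j] is the byte k_j

    def word(*indices):
        # Assumption: the leftmost listed byte is the most significant one.
        return int.from_bytes(bytes(k[j] for j in indices), "big")

    U = [word(8 * i + 6, 8 * i + 4, 8 * i + 2, 8 * i) for i in range(4)]
    V = [word(8 * i + 7, 8 * i + 5, 8 * i + 3, 8 * i + 1) for i in range(4)]
    return U, V
```

For a 16-byte (128-bit) key, the words with index 2 and 3 come out as zero, which is the fact the reconstruction relies on.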
B.2 Reconstruction of the 128-Bit User Key

We have $\pi_o \circ \gamma_o(U) = \tau(U')$ with $U = (\mathtt{0x0}, \mathtt{0x0}, U[1], U[0])^t$. Let $\gamma_o(U) =
(b, a, 1', 0')^t$ with $b = \mathtt{0x8d63b1b1}$ and $a = \mathtt{0xb18d63b1}$. The $a$ and $b$ values can
be calculated from the definition of $\gamma_o$ and from the S-boxes. Now we try to
find the unknown values $0'$ and $1'$.

If we guess byte 0 of $T_1$ (the rightmost byte of $T_1$) it is possible to calculate the
upper row of $\pi_o \circ \gamma_o(U)$. The structure of this state is shown in Figure 6.
The rows in this scheme are counted from top to bottom starting with row 0.
The symbols in the scheme denote copies of the corresponding 2-bit values of the
following 32-bit values:

$\bar{0}' = (0' \oplus 1' \oplus a \oplus b) \oplus 0',$
$\bar{1}' = (0' \oplus 1' \oplus a \oplus b) \oplus 1',$
$\bar{a} = (0' \oplus 1' \oplus a \oplus b) \oplus a,$
$\bar{b} = (0' \oplus 1' \oplus a \oplus b) \oplus b.$

Since we know the upper row of the scheme (due to our initial guess of byte 0
of $T_1$) we can calculate byte 1 of $T_1$, because we can calculate 4 times 2 bits of
$T_1$ on the positions of the second row of the scheme where we find an $\bar{a}$ symbol
(the symbols 1 in Figure 6), because:

$\bar{a} = (0' \oplus 1' \oplus a \oplus b) \oplus a$
$\quad = (0' \oplus 1' \oplus a \oplus b) \oplus b \oplus b \oplus a$
$\quad = \bar{b} \oplus (a \oplus b),$

and we know $a \oplus b$. Now we can calculate row 1 of our scheme $\pi_o \circ \gamma_o(U)$
completely.

Fig. 6. 128-bit user key recovery

Next we calculate byte 2 of $T_1$ using the positions in row 2 where we find an
$\bar{a}$ symbol (the symbols 2 in Figure 6). We can check the correctness of byte 0 by
checking 8 additional symbols in row 2 (the black squares in row 2 in Figure 6)
since we have the formulae:

$\bar{a} \oplus \bar{1}' \oplus b = (0' \oplus 1' \oplus a \oplus b) \oplus a \oplus (0' \oplus 1' \oplus a \oplus b) \oplus 1' \oplus b$
$\quad = (0' \oplus 1' \oplus a \oplus b) \oplus 0'$
$\quad = \bar{0}',$

$\bar{0}' \oplus \bar{1}' \oplus a = \ldots = \bar{b},$

and $a$ and $b$ are known in advance.

Finally we calculate byte 3 of $T_1$ using the formula $s_0 \oplus s_1 \oplus s_2 = s_3$ with
$s_0 s_1 s_2 s_3$ a permutation of the four symbols defined above. If we obtain four times
the same value for byte 3 (in each of the four columns), we have found $T_1$. If we
obtain several different values for byte 3 of $T_1$, the initial assumption of byte 0
of $T_1$ was wrong and we have to continue guessing it.

If we have found a correct value for $T_1$ then we have found the states $U'$ and $U$
completely. So we can calculate $T_0 = \bigoplus_{i=0}^{3} U'[i]$, so we have state $V'$ and finally
state $V$. $V[2]$ and $V[3]$ should both be 0x00000000.
Abstract. DEAL is a DES-based block cipher proposed by Knudsen.
The block size of DEAL is 128 bits, twice as much as the DES block size. 
The main result of the current paper is a certificational attack on DEAL- 
192, the DEAL variant with a 192-bit key. The attack allows a trade-off 
between the number of plaintext/ciphertext pairs and the time for the 
attacker’s computations. Nevertheless, the DEAL design principle seems 
to be a useful way of doubling the block size of a given block cipher. 



1 Introduction 

The “data encryption standard” (DES) is the world’s most well known 
symmetric cipher. Formally, the standard defines a 64-bit key, but 8 bits 
are defined as “parity bits” and only 56 bits are actually used as the 
encryption key, i.e., the DES key size is 56 bits. Brute-force attacks for 
recovering a key are feasible, today - and considered the only practical 
way of breaking DES. Thus, while the DES itself cannot be considered 
secure, it is still attractive to use it as a component for designing another 
cipher with an increased key size, such as triple DES. A concern both for 
DES and for triple DES is the block size of only 64 bits, which may lead 
to matching ciphertext attacks. 

In PI, Knudsen proposes the r-round Feistel cipher DEAL with a 
block size of 128 bits. It uses DES in the round function and accepts three 
different key sizes, namely 128, 192, and 256 bits. For the first two sizes, 
the author recommends r = 6, for 256 bit keys, the number r of rounds 
should be 8. Depending on the key size, the three variants of DEAL are
denoted DEAL-128, DEAL-192, and DEAL-256. DEAL is suggested as a
candidate for the NIST AES standard.

This paper is organised as follows. In Section 2 a description of DEAL
itself is given. Section 3 presents attacks on the six-round version of
DEAL, and Section 4 deals with further concerns and conclusions.

* Supported by German Science Foundation (DFG) grant KR 1521/3-1.

L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 60-70, 1999.

(c) Springer-Verlag Berlin Heidelberg 1999

2 A Description of DEAL

Next, we describe the block cipher DEAL and the key schedules for 
DEAL-128, DEAL-192, and DEAL-256. 

2.1 The DEAL Core 

A 128-bit plaintext is split up into two halves $(x_0, y_0) \in (\{0,1\}^{64})^2$.
Two consecutive rounds $j$ and $j+1$ of DEAL take the 128-bit block
$(x_{j-1}, y_{j-1}) \in (\{0,1\}^{64})^2$ and the two round keys $R_j$ and $R_{j+1}$ as the
input to compute the output block $(x_{j+1}, y_{j+1}) \in (\{0,1\}^{64})^2$ by

$x_j := x_{j-1}, \quad y_j := y_{j-1} \oplus E_{R_j}(x_{j-1}),$

$x_{j+1} := x_j \oplus E_{R_{j+1}}(y_j), \quad \text{and} \quad y_{j+1} := y_j,$

where $\oplus$ describes the bit-wise xor operation for 64-bit strings and $j$ is
odd. By $E$, we denote the DES encryption function. Two rounds $j$ and
$j+1$ are also described in Figure 1.



Fig. 1. Round j and j + 1 of DEAL, j odd
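
The round structure above can be sketched as follows. This is a minimal illustration, not an implementation of DEAL: since DES is not part of the Python standard library, the hypothetical function `toy_E` (a small hash-based Feistel permutation on 64 bits) stands in for the DES encryption function E, and round keys are handled as plain 64-bit integers.

```python
import hashlib

def toy_E(key: int, block: int) -> int:
    """Stand-in for DES (NOT DES): a 4-round Feistel permutation on 64 bits."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for rnd in range(4):
        data = key.to_bytes(8, "big") + bytes([rnd]) + right.to_bytes(4, "big")
        f = int.from_bytes(hashlib.sha256(data).digest()[:4], "big")
        left, right = right, left ^ f
    return (left << 32) | right

def deal_encrypt(round_keys, x, y, E=toy_E):
    """Apply pairs of rounds (j, j+1); len(round_keys) must be even."""
    for j in range(0, len(round_keys), 2):
        y ^= E(round_keys[j], x)        # odd round: x is kept, y is updated
        x ^= E(round_keys[j + 1], y)    # even round: y is kept, x is updated
    return x, y

def deal_decrypt(round_keys, x, y, E=toy_E):
    """Undo the rounds in reverse order; E is only ever used in the forward direction."""
    for j in range(len(round_keys) - 2, -1, -2):
        x ^= E(round_keys[j + 1], y)
        y ^= E(round_keys[j], x)
    return x, y
```

As in any Feistel network, decryption only needs the forward direction of E, which is why the sketch never inverts `toy_E`.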



Thus, for DEAL-128 and DEAL-192 we need 6 round keys $R_1, \ldots, R_6$,
for DEAL-256 we need 8 round keys $R_1, \ldots, R_8$. Internally, every round
key is used as a DES key, ignoring the “parity bits”, and hence consists of
56 bits. We need three “key scheduling” algorithms to generate the round
keys from the given master key of 128, 192, or 256 bits.

See Figure 2 for a visual description of six rounds of DEAL.

Fig. 2. The six-round version of DEAL



2.2 The DEAL Key Schedule 

The key schedule of DEAL takes $s$ keys $K_1, \ldots, K_s$ of 64 bits each ($s \in
\{2, 3, 4\}$) and returns $r$ round keys $R_1, \ldots, R_r$ of 56 bits each ($r \in \{6, 8\}$).
The round keys are generated using DES encryption under a fixed DES
key $R_*$ (which is $R_* = \mathtt{0123456789abcdef}$ in hexadecimal notation).
$\langle 1 \rangle, \ldots, \langle 4 \rangle$ are four different constant 64-bit strings, none of which is $000\ldots0$.
The DEAL-128 round keys are generated from $K_1$, $K_2$ as follows:

$R_1 := E_{R_*}(K_1)$
$R_2 := E_{R_*}(K_2 \oplus R_1)$
$R_3 := E_{R_*}(K_1 \oplus R_2 \oplus \langle 1 \rangle)$
$R_4 := E_{R_*}(K_2 \oplus R_3 \oplus \langle 2 \rangle)$
$R_5 := E_{R_*}(K_1 \oplus R_4 \oplus \langle 3 \rangle)$
$R_6 := E_{R_*}(K_2 \oplus R_5 \oplus \langle 4 \rangle)$

The DEAL-192 round keys are generated from $K_1$, $K_2$, and $K_3$ like this:

$R_1 := E_{R_*}(K_1)$
$R_2 := E_{R_*}(K_2 \oplus R_1)$
$R_3 := E_{R_*}(K_3 \oplus R_2)$
$R_4 := E_{R_*}(K_1 \oplus R_3 \oplus \langle 1 \rangle)$
$R_5 := E_{R_*}(K_2 \oplus R_4 \oplus \langle 2 \rangle)$
$R_6 := E_{R_*}(K_3 \oplus R_5 \oplus \langle 3 \rangle)$

Given $K_1$, $K_2$, $K_3$, and $K_4$, the DEAL-256 round keys are:

$R_1 := E_{R_*}(K_1)$
$R_2 := E_{R_*}(K_2 \oplus R_1)$
$R_3 := E_{R_*}(K_3 \oplus R_2)$
$R_4 := E_{R_*}(K_4 \oplus R_3)$
$R_5 := E_{R_*}(K_1 \oplus R_4 \oplus \langle 1 \rangle)$
$R_6 := E_{R_*}(K_2 \oplus R_5 \oplus \langle 2 \rangle)$
$R_7 := E_{R_*}(K_3 \oplus R_6 \oplus \langle 3 \rangle)$
$R_8 := E_{R_*}(K_4 \oplus R_7 \oplus \langle 4 \rangle)$

The parity bits of the 64-bit values $R_i$ are ignored when $R_i$ is used as a
DES key (i.e., a DEAL round key), but relevant for computing $R_{i+1}$.
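
The chaining in the DEAL-192 schedule can be sketched as below. Two things are assumptions made for illustration only: `toy_E` is a hash-based 64-bit permutation standing in for DES, and the values in `CONST` are placeholder choices for the constants, which the text only requires to be distinct and nonzero. The sketch keeps full 64-bit values and does not model the dropping of parity bits.

```python
import hashlib

R_STAR = 0x0123456789ABCDEF  # the fixed key stated in the text

def toy_E(key: int, block: int) -> int:
    """Stand-in for DES (NOT DES): a 4-round Feistel permutation on 64 bits."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for rnd in range(4):
        data = key.to_bytes(8, "big") + bytes([rnd]) + right.to_bytes(4, "big")
        f = int.from_bytes(hashlib.sha256(data).digest()[:4], "big")
        left, right = right, left ^ f
    return (left << 32) | right

# Placeholder constants: distinct, nonzero 64-bit strings (hypothetical values).
CONST = {i: 1 << (64 - i) for i in range(1, 5)}

def deal192_round_keys(k1: int, k2: int, k3: int, E=toy_E):
    """Each round key feeds into the next one, starting from R_1 = E(K_1)."""
    r1 = E(R_STAR, k1)
    r2 = E(R_STAR, k2 ^ r1)
    r3 = E(R_STAR, k3 ^ r2)
    r4 = E(R_STAR, k1 ^ r3 ^ CONST[1])
    r5 = E(R_STAR, k2 ^ r4 ^ CONST[2])
    r6 = E(R_STAR, k3 ^ r5 ^ CONST[3])
    return [r1, r2, r3, r4, r5, r6]
```

The sketch makes one property easy to see: the first round key depends on the first sub-key only, so two master keys that share that sub-key share the first round key.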

3 Attacking DEAL 

Well known meet-in-the-middle techniques can be used to recover the 
DEAL round keys. This was stressed by Knudsen himself. The six-round 
version of DEAL is vulnerable to a meet-in-the-middle attack requiring
roughly $(2^{56})^3 = 2^{168}$ encryptions. For the eight-round version, the attack
needs roughly $(2^{56})^4 = 2^{224}$ encryptions.

Thus the theoretical key size of DEAL is no more than approximately
168 bits for the six-round version and 224 bits for the eight-round version. This
bounds the theoretical key size of DEAL-192 (six rounds) and DEAL-256
(eight rounds). Due to their memory requirements, these meet-in-the-middle
techniques are quite unrealistic, though trade-off techniques to
save storage space at the cost of increased running time are known.

Note that finding a DEAL-$n$ key by exhaustive key search (by “brute
force”) takes about $c \cdot 2^n/2$ single DES encryptions with $c = 8$ for DEAL-192,
$c = 9$ for DEAL-128, and $c = 11$ for DEAL-256. (One can reject
most combinations of round keys before the last round, i.e., rejecting
a combination of round keys takes about 5 DES encryptions for the six-round
variants of DEAL and about 7 DES encryptions for the eight-round
variant. In the case of DEAL-192, we need to generate the first two round
keys at most every $2^{64}$ steps, while the round keys $R_3$, $R_4$, and $R_5$ are
generated every step. This takes about 3 DES encryptions. The reasoning
for DEAL-128 and DEAL-256 is similar.)
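
As a quick arithmetic check of the cost formula above, the following sketch evaluates c * 2^n / 2 on a log2 scale. The constants are the values of c given in the text; the names `COST_PER_TRIAL` and `brute_force_log2` are illustrative.

```python
from math import log2

# Constant c (single DES encryptions per trial key), as stated in the text.
COST_PER_TRIAL = {128: 9, 192: 8, 256: 11}

def brute_force_log2(key_bits: int) -> float:
    """log2 of c * 2^n / 2, the average cost of exhaustive key search."""
    return log2(COST_PER_TRIAL[key_bits]) + key_bits - 1

for n in (128, 192, 256):
    print(f"DEAL-{n}: about 2^{brute_force_log2(n):.2f} single DES encryptions")
```

For DEAL-192 the result is exactly 2^194, since c = 8 is a power of two.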

3.1 A Chosen Ciphertext Attack for the Six-Round Version

In addition to meet-in-the-middle attacks, Knudsen describes a chosen
plaintext attack to recover the round keys of the six-round version of
DEAL. His attack requires about $2^{121}$ DES encryptions using roughly $2^{70}$
chosen plaintexts. This is significantly faster than the average number
of DES encryptions needed to break DEAL-128 by exhaustive key search
(i.e., about $2^{130}$ encryptions) and greatly faster than the number of DES
encryptions to break DEAL-192 ($2^{194}$). Due to the huge amount of chosen
plaintexts, Knudsen claims that exhaustive key search nevertheless is less
unrealistic than this attack.

We will use the same technique, but we are going backwards, i.e.,
our attack is a chosen ciphertext attack to gain information about the
first round key $R_1$. Given the first 56-bit round key $R_1$, there are only $2^8$
possible choices for the first sub-key $K_1$ of the master key. Knowing the
last 56-bit round key instead of the first one is somewhat less helpful for
the attacker, due to the DEAL key schedule.

Recall the last five rounds of six-round DEAL. The round keys in use
are $R_2, \ldots, R_6$, the input is the pair $(x_1, y_1)$ of 64-bit strings (where
$x_1 = x_0$ is the left half of the plaintext and $y_1$ is generated in the first
round by $y_1 = y_0 \oplus E_{R_1}(x_1)$), and the output is the ciphertext $(x_6, y_6)$.
Consider two input/output pairs $((x_1, y_1), (x_6, y_6))$ and $((x_1', y_1'), (x_6', y_6'))$
with $y_1 = y_1'$, $y_6 = y_6'$, $x_1 \oplus x_1' = \alpha = x_6 \oplus x_6'$ and $\alpha \neq 0$.

First, we show that two such input/output pairs cannot both exist:
Since $y_1 = y_1'$ and $y_6 = y_6'$, we have $x_3 \oplus x_3' = \alpha = x_5 \oplus x_5'$. On the other
hand, $y_1 = y_2 = y_1' = y_2'$ and $x_1 \neq x_1'$, thus $y_3 = y_2 \oplus E_{R_3}(x_2) \neq y_3'$ and
hence $E_{R_4}(y_3) \neq E_{R_4}(y_3')$. If $x_3 \oplus x_3' = \alpha$, then $x_5 \oplus x_5' = \alpha \oplus E_{R_4}(y_3) \oplus
E_{R_4}(y_3') \neq \alpha$, in contradiction to the above.
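
The non-existence argument only uses the fact that each round function is a permutation, so it can be checked exhaustively on a scaled-down model. The sketch below is such a check under stated assumptions: half-blocks have 4 bits instead of 64, and randomly chosen permutation tables play the role of DES under the round keys of rounds 2 to 6. All names are illustrative.

```python
import random
from itertools import product

BITS = 4              # toy half-block size (DEAL uses 64 bits)
SIZE = 1 << BITS

def random_permutation(rng):
    table = list(range(SIZE))
    rng.shuffle(table)
    return table

def last_five_rounds(perms, x1, y1):
    """Rounds 2..6 of six-round DEAL; perms[j] plays the role of E under R_j."""
    x, y = x1, y1
    x ^= perms[2][y]   # round 2 (even): x updated, y kept
    y ^= perms[3][x]   # round 3 (odd):  y updated, x kept
    x ^= perms[4][y]   # round 4
    y ^= perms[5][x]   # round 5
    x ^= perms[6][y]   # round 6
    return x, y

def count_forbidden_pairs(seed=0):
    """Count pairs with y1 = y1', y6 = y6' and x1 ^ x1' = x6 ^ x6' != 0."""
    rng = random.Random(seed)
    perms = {j: random_permutation(rng) for j in range(2, 7)}
    hits = 0
    for y1 in range(SIZE):
        outs = [last_five_rounds(perms, x1, y1) for x1 in range(SIZE)]
        for x1, x1p in product(range(SIZE), repeat=2):
            if x1 == x1p:
                continue
            (x6, y6), (x6p, y6p) = outs[x1], outs[x1p]
            if y6 == y6p and (x1 ^ x1p) == (x6 ^ x6p):
                hits += 1
    return hits
```

For every seed the count is zero, in line with the argument above. The toy model says nothing about DES itself; it only illustrates the structural property of the five Feistel rounds.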

Next, we exploit their non-existence to narrow down the number of
possibilities for $R_1$ and hence $K_1$:

- Choose a fixed value $y_6 \in \{0,1\}^{64}$ and $2^{64}$ ciphertexts $(s, y_6)$, where
$s \in \{0,1\}^{64}$. Let $(x_0[s], y_0[s])$ denote the corresponding plaintexts.
Then we expect to find about $2^{63}$ two-sets $\{s, t\} \subset \{0,1\}^{64}$ with $s \neq t$
and $x_0[s] \oplus x_0[t] = s \oplus t$.

- Check all possible 56-bit DES keys $R$, whether

$y_0[s] \oplus E_R(x_0[s]) = y_0[t] \oplus E_R(x_0[t]).$

As we have shown above, this is impossible if $R = R_1$. Hence, if “=”
holds, we know $R \neq R_1$, which limits further key search. If $R \neq R_1$,
the operation $E_R : \{0,1\}^{64} \to \{0,1\}^{64}$ can be viewed as a random permutation,
hence in this case “=” holds with a probability of $2^{-64}$. Given $2^{63}$ such
two-sets $\{s, t\}$, we expect to have reduced the possible choices for $R_1$
by 50%.

At first, we have $2^{56}$ choices for $R_1$, and the attack takes about
$2^{63} \cdot 2^{56} \cdot 2 = 2^{120}$ DES encryptions to reduce the number of choices
for $R_1$ down to $2^{55}$. Repeating the attack, we need another $2^{119}$ DES
encryptions to reduce the number of choices down to $2^{54}$, another $2^{118}$ DES
encryptions to reduce it to $2^{53}$, ..., hence we may pin down $R_1$ by doing
no more than $2^{121}$ DES encryptions. This leaves open $2^8$ choices for $K_1$.

Now, the complete master key can easily be recovered by exhaustive
key search techniques. In the case of DEAL-128, we need $2^{64} \cdot 2^8 = 2^{72}$
trials to find $(K_1, K_2)$. For DEAL-192, we need $2^{64} \cdot 2^{64} \cdot 2^8 = 2^{136}$ trials to
recover $(K_1, K_2, K_3)$. Here though, exhaustive key search is not optimal.
Since we have found the first round key of a six-round Feistel cipher,
recovering the second round key requires a similar attack on a five-round
Feistel cipher and hence is even simpler.
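
The totals quoted above follow from a geometric series, which the sketch below adds up exactly with Python integers. It assumes the halving model described in the text (each pass tests every remaining candidate against about 2^63 two-sets with two encryptions each); the function name `pin_down_cost` is illustrative.

```python
def pin_down_cost(pair_count_log2: int = 63, key_bits: int = 56) -> int:
    """Total DES encryptions when every pass halves the candidate set for R_1."""
    total = 0
    for remaining_bits in range(key_bits, 0, -1):
        # two encryptions per (two-set, candidate key) combination
        total += 2 * 2**pair_count_log2 * 2**remaining_bits
    return total

total = pin_down_cost()
print(total < 2**121)          # the bound stated in the text
print(2**64 * 2**8 == 2**72)   # remaining trials for DEAL-128
print(2**64 * 2**64 * 2**8 == 2**136)  # remaining trials for DEAL-192
```

The first pass costs 2^120 and each later pass half as much, so the sum stays just below 2^121.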

Theoretically, this attack is much better than exhaustive key search
and meet-in-the-middle. But due to the huge amount of chosen ciphertexts
required, it is quite impractical.

3.2 A Dedicated Attack for DEAL-192 

Next, we describe another chosen ciphertext attack. This attack takes
more time than the previous one, but only needs $2^{32+\tau}$ chosen ciphertexts
(with $0 < \tau < 32$) instead of $2^{64}$. E.g., for $\tau = 0.5$ the keyspace is reduced
by about 50%.

Recall the last four rounds of six-round DEAL, using the round keys
$R_3$, $R_4$, $R_5$, and $R_6$. The input to the last four rounds is $(x_2, y_2) \in
(\{0,1\}^{64})^2$, the output is $(x_6, y_6) \in (\{0,1\}^{64})^2$. Consider two input/output
pairs $((x_2, y_2), (x_6, y_6))$ and $((x_2', y_2'), (x_6', y_6'))$ with $y_6 = y_6'$, $x_2 \oplus x_2' =
\alpha = x_6 \oplus x_6'$ and $\alpha \neq 0$. (The value $y_2 \oplus y_2'$ may be arbitrary.)

Two such input/output pairs cannot both exist: Since $x_6 \oplus x_6' = \alpha$
and $y_6 = y_6'$, we have $x_5 \oplus x_5' = \alpha = x_4 \oplus x_4'$, and hence $y_4 \oplus y_4' \neq 0$. On
the other hand, since $x_2 \oplus x_2' = x_3 \oplus x_3' = \alpha$, and $x_5 \oplus x_5' = \alpha = x_4 \oplus x_4'$,
we need $y_3 \oplus y_3' = 0$. This is in contradiction to $y_4 = y_3$ and $y_4' = y_3'$.
As above, we exploit the non-existence of such pairs for key recovery
purposes:

- Choose a fixed value $y_6 = y_6' \in \{0,1\}^{64}$, and $2^{32+\tau}$ different values
$s \in \{0,1\}^{64}$, which gives $2^{32+\tau}$ different ciphertexts $(s, y_6)$. Consider
two such ciphertexts $(x_6, y_6)$ and $(x_6', y_6)$ with $x_6 \neq x_6'$ and the
corresponding plaintexts $(x_0, y_0)$ and $(x_0', y_0')$.

- For all possible 56-bit round keys $R$, compute $y_1 = y_0 \oplus E_R(x_0)$ and
$y_1' = y_0' \oplus E_R(x_0')$.

- For all possible 56-bit round keys $S$, compute $x_2 = x_1 \oplus E_S(y_1)$ and
$x_2' = x_1' \oplus E_S(y_1')$. If

$x_2 \oplus x_2' = x_6 \oplus x_6',$

then the key pair $(R, S)$ can be discarded, i.e., $(R_1, R_2) \neq (R, S)$.

If $(R, S)$ is the wrong key pair, we expect “=” to hold with a probability
of $2^{-64}$. Since we have $2^{32+\tau}$ chosen ciphertexts and thus
$\binom{2^{32+\tau}}{2} \approx (2^{32+\tau})^2/2 \approx 2^{64} \cdot 2^{2\tau-1}$ sets of exactly two ciphertexts, we
expect the fraction of the candidates for $(R_1, R_2)$ to be discarded to be
roughly $1 - 2^{-2\tau}$, e.g. roughly 50% for $\tau = 0.5$.

At first glance, the attack seems to require $2^{32+\tau} \cdot 2^{56} \cdot 2^{56} = 2^{144+\tau}$ single
encryptions. Actually, if either of the round keys $R$ and $S$ is wrong, we
expect to find values $x_2$, $x_2'$, $x_6$, and $x_6'$ with

$x_2 \oplus x_2' = x_6 \oplus x_6',$

after considering less than $2^{33}$ plaintext/ciphertext pairs, on the average.
No more encryption operations (or decryption operations) are needed to
reject the pair $(R, S)$ of round keys. Hence, the expected number of single
encryptions is below $2^{33} \cdot 2^{112} = 2^{145}$. Since 145 > 128, this attack is not
useful for DEAL-128.
On the other hand, the attack actually is useful for DEAL-192, the
DEAL variant with a key size of 192 bits. Note that the attack only narrows
down the number of 192-bit keys from $2^{192}$ to about $2^{192-2\tau}$, i.e., on the
average $8 \cdot 2^{192-2\tau}/2$ additional single DES encryptions are needed to find
the correct key by brute force.

3.3 The Memory Requirements 

For the attack described in Section 3.1, we need to store about $2^{64}$ plaintexts,
each of 128 bits. This requires $2^{71}$ bits of memory; all other memory
requirements are negligible compared to this.

For the attack described in Section 3.2, the attacker apparently has
to store all those about $2^{112-2\tau}$ pairs of 56-bit keys for the first two rounds which
are not discarded. The correct 56-bit key pair is not discarded, and can
be found by further testing. Given a 56-bit key pair $(R, S) \in (\{0,1\}^{56})^2$,
the “further testing” can be done by exhaustively searching $2^{8+8+64} = 2^{80}$
64-bit key triples corresponding to $(R, S)$ to find the correct one. If all
$2^{80}$ key triples corresponding to the pair $(R, S)$ are wrong, then $(R, S)$ is
wrong, too. Instead of storing the pairs $(R, S)$ and doing the “further
testing” later, one may test immediately and save the storage space.

What then dominates the storage space for the attack described in
Section 3.2 is the necessity to store $2^{32+\tau}$ plaintexts, i.e., $2^{39+\tau}$ bits. The
attack in Section 3.2 improves on the attack in Section 3.1 both with
respect to storage space and to the number of chosen ciphertexts.

Table 1 shows the requirements for the attack in Section 3.2 depending
on the parameter $\tau$, i.e., the required number of chosen ciphertexts,
the approximate number of single DES encryptions, and the approximate
number of storage bits.

Table 1. Requirements for the attack from Section 3.2

parameter   chosen ciphertexts   single encryptions           memory
τ           2^(32+τ)             8 * 2^(191-2τ) + 2^145       2^(39+τ) bit
0.5         2^32.5               8 * 2^190                    2^39.5 bit
1           2^33                 8 * 2^189                    2^40 bit
8           2^40                 8 * 2^175                    2^47 bit
16          2^48                 8 * 2^159                    2^55 bit
24          2^56                 8 * 2^143 + 2^145            2^63 bit
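
The rows of Table 1 follow from three formulas in the parameter τ, which can be recomputed with the short sketch below. It works on a log2 scale and uses floating point, so the encryption count is approximate; the function name `attack_requirements` is illustrative.

```python
from math import log2

def attack_requirements(tau: float):
    """Return log2 of (chosen ciphertexts, single encryptions, memory bits)."""
    ciphertexts = 32 + tau
    # brute-force phase 8 * 2^(192 - 2*tau) / 2 plus the filtering phase 2^145
    encryptions = log2(8 * 2.0 ** (191 - 2 * tau) + 2.0 ** 145)
    memory_bits = 39 + tau          # 2^(32 + tau) plaintexts of 128 bits each
    return ciphertexts, encryptions, memory_bits

for tau in (0.5, 1, 8, 16, 24):
    c, e, m = attack_requirements(tau)
    print(f"tau={tau}: 2^{c} ciphertexts, 2^{e:.2f} encryptions, 2^{m} bits")
```

For small τ the brute-force phase dominates the running time; at τ = 24 the filtering term 2^145 becomes visible, which is why only that row of the table lists it.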
3.4 Chosen Plaintext Attacks

The chosen ciphertext attacks we described in Sections 3.1 and 3.2 target
the first round key $R_1$ or the first two round keys $R_1$ and $R_2$. In
principle, these attacks do not depend on any key schedule but work
well with DEAL versions with independent round keys. For reasons of
symmetry, they can also be run backwards as chosen plaintext attacks,
then recovering the last round key or the last two round keys. In fact, the
attack described in Section 3.1 can be viewed as the backward version
of the chosen plaintext attack on DEAL with independent round keys,
described by Knudsen. (The attack in Section 3.2 is new, though.)

The reason why we considered chosen ciphertext attacks instead of the 
possibly more natural chosen plaintext attacks, is the DEAL key schedule. 
It enables a more simple exploitation of knowing the first or the first two 
round keys, than knowing the last or the last two ones. 

4 Final Remarks 

4.1 The Effective Key Size 

From a general point of view, the designer of a new cryptosystem should 
be pessimistic. I.e., when trying to evaluate the effective key size of a 
cipher, known standard techniques (such as meet in the middle) should 
be taken into consideration, even if they appear to be very unrealistic. 
This defines a safety margin, valuable if new and more practical attacks 
are found. 

Thus, we consider the effective key size of DEAL-128 to be 121 bits 
(or less) and the effective key size of DEAL-256 to be no more than 224 
bits. The effective key size of DEAL-192 is 121 bits (or less). In this 
sense, DEAL-192 does not improve on DEAL-128. If a variant of DEAL 
is needed faster than DEAL-256 but with an effective key-size of more 
than 121 bits, a seven-round version of DEAL would be appropriate. 

4.2 The DEAL Key Schedule 

The DEAL key schedule is based on using slow but powerful encryption
operations. On the other hand, the first round key $R_1$ does not depend
on all (master-)sub-keys $K_i$ (neither does $R_2$ depend on all sub-keys of
DEAL-192 and DEAL-256, nor does $R_3$ depend on all sub-keys of
DEAL-256). Once we have recovered $R_1$ (and/or $R_2$, $R_3$), recovering the
complete master key is easier than necessary. Under ideal circumstances,
with randomly chosen keys $K_i \in \{0,1\}^{64}$, independently and
according to the uniform probability distribution, this is a minor concern.

But Vaudenay observed the following:

If the choice of the keys $K_i$ is restricted, the attacks described in
this paper can be improved. Think of the bytes of the keys $K_i$ being
restricted to a set of $c < 2^8$ printable characters. This indicates that
there are only $c^8$ choices for $K_i$, and $c^{24}$ choices for a DEAL-192 key.
Since the number of choices for $R_1$ is reduced to $c^8$ instead of $2^{56}$, the
attack in Section 3.1 becomes $2^{56}/c^8$ times faster. (Note that there is no
speed-up for the corresponding chosen plaintext attack described by Knudsen,
since the number of choices for $R_6$ still is $2^{56}$.) Similarly, the required
number of single encryptions for the attack in Section 3.2 is reduced to
$8 \cdot c^{24} \cdot 2^{-2\tau} \cdot 2^{-1} + 2^{33} c^{16}$, instead of $8 \cdot 2^{192} \cdot 2^{-2\tau} \cdot 2^{-1} + 2^{33} \cdot 2^{2 \cdot 56}$.

If we think of, say, $c = 64 = 2^6$, the effective key size of DEAL-192
is reduced to 144 bits. The attack in Section 3.1 would become $2^8$ times
faster. Depending on $\tau$, the speed-up for the attack in Section 3.2 would
be between $2^{16}$ and $2^{48}$.
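
The figures for c = 64 can be reproduced with exact arithmetic. The sketch below is only a check of the numbers in the text; the function name `speedup_section_3_2` and the use of the integer `two_tau` (standing for 2τ) are illustrative choices.

```python
from fractions import Fraction
from math import log2

C = 64  # example alphabet size from the text

def speedup_section_3_2(two_tau: int, c: int = C) -> float:
    """log2 of (unrestricted cost / restricted cost) for the attack of Section 3.2."""
    shrink = Fraction(1, 2 ** two_tau)                  # the factor 2^(-2*tau)
    full = 8 * 2**192 * shrink / 2 + 2**33 * 2**112
    restricted = 8 * c**24 * shrink / 2 + 2**33 * c**16
    ratio = full / restricted
    return log2(ratio.numerator) - log2(ratio.denominator)

print(log2(C**8), log2(C**24))      # choices for one sub-key and for a DEAL-192 key
print(log2(2**56 // C**8))          # speed-up for the attack of Section 3.1
print(speedup_section_3_2(1), speedup_section_3_2(64))
```

Small values of τ give a speed-up close to 2^48 (the brute-force term dominates), while large values approach 2^16 (the filtering term dominates).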

Also note that if a part of the key is compromised, then the security
of DEAL depends on which part is compromised. E.g., if 64 bits of a
DEAL-192 key are compromised, the remaining security should still be
about the security of DEAL-128. Apparently, this is no problem if the 64
bits of $K_3$ are compromised. But if the attacker knows $K_1$, she effectively
has to attack a five-round variant of DEAL with a 128-bit key, instead of
something comparable to six-round DEAL-128.

Our concern regarding the key schedule could easily be fixed, requiring
one additional encryption for DEAL-128, two for DEAL-192, and three
for DEAL-256, applying the same general design principle all DEAL
key schedules are based on. For DEAL-128, we propose to use $R_2, \ldots,
R_7$ as the round keys (and hence throwing away $R_1$ after running the key
schedule), where

$R_7 := E_{R_*}(K_1 \oplus R_6 \oplus \langle 5 \rangle),$

where $\langle 5 \rangle$ is a constant different from $000\ldots0$, $\langle 1 \rangle$, $\langle 2 \rangle$, $\langle 3 \rangle$, and $\langle 4 \rangle$. Similar
modifications for DEAL-192 and DEAL-256 are obvious.

Note that the additional encryption operations require only between 
1/6 and 3/8 of the time for the original key schedule, hence the slow-down 
for the above modification should be acceptable. 
4.3 Conclusions and Open Problem

In spite of these concerns, DEAL is a simple but useful way of constructing 
a new block cipher based on another block cipher E, doubling the block 
size. There is no limitation to E=DES. One could as well view DEAL 
as a “mode of operation” of the underlying block cipher E, instead of a 
block cipher of its own right. 

One may reasonably expect DEAL to be significantly more secure 
than its building block DES. It would be interesting to actually prove 
this, assuming the security of the underlying building block. This has 
been done before for cryptographic primitives based on other primitives, 
e.g. for the DES-based 64-bit block cipher DESX [2]. Such a result may
be seen as a justification of the construction’s soundness, though the actual
security of the construction also depends on the security of the underlying 
block cipher. 
Abstract. This paper deals with truncated differential cryptanalysis of
the 128-bit block cipher E2, which is an AES candidate designed and 
submitted by NTT. Our analysis is based on byte characteristics, where 
a difference of two bytes is simply encoded into one bit information “0” 
(the same) or “1” (not the same). Since E2 is a strongly byte-oriented 
algorithm, this bytewise treatment of characteristics greatly simplifies 
a description of its probabilistic behavior and noticeably enables us an 
analysis independent of the structure of its (unique) lookup table. As a 
result, we show a non-trivial seven round byte characteristic, which leads 
to a possible attack of E2 reduced to eight rounds without IT and FT by 
a chosen plaintext scenario. We also show that by a minor modification of 
the byte order of output of the round function — which does not reduce 
the complexity of the algorithm nor violates its design criteria at all — , 
a non-trivial nine round byte characteristic can be established, which 
results in a possible attack of the modified E2 reduced to ten rounds 
without IT and FT, and reduced to nine rounds with IT and FT. Our 
analysis does not have a serious impact on the full E2, since it has twelve 
rounds with IT and FT; however, our results show that the security level 
of the modified version against differential cryptanalysis is lower than 
the designers’ estimation. 



1 Introduction 

E2 [1] is a 128-bit block cipher designed by NTT, which is one of the fifteen
candidates in the first round of the AES project. Its design criteria are conservative,
adopting a Feistel network and a lookup table without shift and arithmetic
operations (except multiplications in the initial transformation IT and the final
transformation FT). Moreover E2 has a strongly byte-oriented structure; all
operations used in the data randomization phase are byte table lookups and byte
xors except 32-bit multiplications in IT and FT, which successfully makes E2 a
fast software cipher independent of target platforms, e.g. 8-bit microprocessors
to 64-bit RISC computers.

This byte-oriented structure motivates us to cryptanalyze E2 by initially
looking at the relationship between the number/location of input byte changes and
that of output byte changes. For instance, a one-byte change of the input of the round
function always results in a five- or six-byte change of its output, which is a part
of the design criteria of E2. However, when we change several input bytes, say
two bytes simultaneously, it is possible that only three output bytes are changed
with the remaining five bytes unchanged.

L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 71-80, 1999.

Since the round function of E2 has eight input/output bytes, its bytewise
change pattern can be represented by eight-bit information where a difference
of two bytes is encoded into one bit of information, “0” (the same) or “1” (not
the same). In this paper we call this change pattern a “byte characteristic” or
simply a characteristic. Due to this simplification, it is not hard to create a complete
256 × 256 characteristic distribution table that exhausts all possibilities of
input/output byte change patterns of the round function.
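
The encoding described above is simple enough to state as code. The sketch below maps two equal-length byte strings to their change pattern; the function name `byte_characteristic` is illustrative and not taken from the paper.

```python
def byte_characteristic(block_a: bytes, block_b: bytes) -> str:
    """Encode each byte position as '0' (the same) or '1' (not the same)."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    return "".join("0" if a == b else "1" for a, b in zip(block_a, block_b))

# Eight bytes give an 8-bit pattern, hence 2^8 input and 2^8 output patterns.
TABLE_ROWS = TABLE_COLUMNS = 2 ** 8

print(byte_characteristic(bytes([1, 2, 3, 4, 5, 6, 7, 8]),
                          bytes([9, 2, 3, 4, 5, 6, 7, 8])))
```

A change in the first byte only yields the pattern 10000000, whatever the actual byte values are.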

The next step of our analysis is to establish byte characteristics of the whole
cipher and to find effective ones. Since the characteristics consist of sixteen
bits, even a complete search for the best characteristic is possible in terms of
computational complexity. We have reached an “iterative” byte characteristic
of E2, which is non-trivial up to seven rounds. This leads to a possible attack
of E2 reduced to eight rounds without IT and FT using 2^°° chosen plaintext 
message blocks. 

Also, we show that by a minor modification of the byte order of the output
of the round function (which corresponds to a change of the BRL function [1] but
neither reduces the complexity of the algorithm nor violates its design criteria
at all), a non-trivial nine round byte characteristic can be obtained, which
results in a possible attack of the modified E2 reduced to ten rounds without IT
and FT, and reduced to nine rounds with IT and FT using 2®"^ and 2®^ chosen 
plaintext message blocks, respectively. 

It should be pointed out that our analysis does not make use of the structure
information of the lookup table; that is, our results hold for any (bijective) lookup
table. However we will effectively use the fact that E2 has only one lookup table.
This means that if E2 had many different tables in its round function, our
attack could be (sometimes) harder. We will state the reason for this phenomenon
in a later section.

Our analysis does not have a serious impact on the full E2, since it has twelve
rounds with IT and FT; however our results show that the security level of the
modified version against differential cryptanalysis is lower than the designers’
general estimation, which is applicable to both the real and the modified E2.

2 Preliminaries 

Figure 1 shows the entire structure of E2. Figures 2 to 4 show its round function,
initial transformation and final transformation, respectively. In these figures the
broken lines show subkey information, where we do not treat its key scheduling
part. The notations in these figures will be used throughout this paper. For the
exact detail, see [1]. For readers’ convenience, we give an algebraic description of
the variable $d_i$ in the round function in terms of the intermediate values $c_i$ as
follows:

3 Byte Characteristic Distribution of the Round Function



E2 was designed so that for any one-byte change of input of the round function,
at least five output bytes (specifically five or six bytes) are changed. For
instance, it is easy to check that if we change $a_1$, leaving the remaining seven
bytes unchanged, then $g_1$, $g_2$, $g_3$, $g_4$, $g_5$ and $g_7$ are always changed while the
remaining two bytes are never changed.

Clearly this pattern of byte locations does not depend on the amount of change
of $a_1$. We describe this transition rule as follows:



$(10000000) \rightarrow (11111010) \quad p = 1.$ (1)

Next, when we change two input bytes of the round function simultaneously,
it is also easy to see that there are exactly two cases of output byte difference
patterns. For example, when we change $a_1$ and $a_5$ simultaneously, if the amount
of change of $c_1$ ($\Delta c_1$) is equal to that of $c_5$ ($\Delta c_5$), then only three bytes $g_1$, $g_5$
and $g_8$ are changed, otherwise all bytes except $g_6$ are changed. Assuming that
the input values ($a_1$ to $a_8$) and the amounts of change ($\Delta a_1$ and $\Delta a_5$) are given
randomly, the first case happens with approximate probability $2^{-8}$ (the exact
value is 1/255, but for simplicity we use this approximation throughout this
paper). The following denotes this transition rule:

$(10001000) \rightarrow (10001001) \quad p = 2^{-8},$ (2)

$(10001000) \rightarrow (11111011) \quad p = 1 - 2^{-8}.$ (3)



Similarly we can apply this notation to an arbitrary number of byte changes.
Now one of the most useful byte characteristics of the round function of E2 is the
following “cyclic” one, whose input pattern is the same as its output pattern. This
characteristic takes place when $\Delta c_1 = \Delta c_4 = \Delta c_6$, hence with the probability
$2^{-16}$.

$(10010100) \rightarrow (10010100) \quad p = 2^{-16}.$ (4)

Also, the following characteristic will be used in a later section:

$(10110000) \rightarrow (10000010) \quad p = 2^{-16}.$ (5)
4 Byte Characteristic Distribution of E2

Using the cyclic characteristic shown in (4), we can obtain a seven round
characteristic of E2 (without IT and FT) as shown in figure 5. Note that an xor
operation outside the round function may cancel differences; that is, $1 \oplus 1 = 0$
with probability 1/255 and $1 \oplus 1 = 1$ with probability 254/255. For simplicity
again, we will regard these probabilities as $2^{-8}$ and 1, respectively. In figure 5,
this cancellation happens three times (three bytes) at an xor operation after
the sixth round function. As a result, the seven round characteristic holds with
approximate probability $(2^{-16})^5 \times 2^{-24} = 2^{-104}$.

This means that when we change the Drst, fonrth and sixth bytes of the 
plaintext block simnltaneously without changing other bytes, then after the se- 
venth round, the probability that the three bytes of the same location change 
and other bytes do not change becomes 2®^°^. On the other hand, if the round 
function is a random function, the same change is expected to appear with pro- 
bability ( 2 ®®)^^ = 2 ®^°"^ again, since the number of unchanged bytes is thirteen. 
Therefore the expected number of “correct pairs” is the same as that of “wrong 
pairs” . 

Now remember that the correct pairs can be detected with probability $2^{-104}$
under the assumption that the differences of the specified input bytes
($\Delta a_1$, $\Delta a_4$ and $\Delta a_6$ in this case) are given randomly. However, if we are able to
give plaintext pairs with non-random differences in a chosen plaintext scenario,
this probability may be greater. In fact, when we generate input plaintext pairs
such that the equation $\Delta a_1 = \Delta a_4 = \Delta a_6$ holds, then the transition probability
of the second round function jumps to approximately $9.3 \times 2^{-16}$, not $2^{-16}$, which
is an experimental result. The reason for this increase is that
the following probability is significantly larger than $2^{-8}$ when $\Delta x = \Delta y$, while
it is expected to be $2^{-8}$ on average when $\Delta x \neq \Delta y$.

$P(\Delta x, \Delta y) = \mathrm{Prob}_{x,y}\{S(x) \oplus S(x \oplus \Delta x) = S(y) \oplus S(y \oplus \Delta y)\}.$   (6)

The exact probability of $P(\Delta x, \Delta y)$ depends on the structure of the substitution
table $S$, but it is easy to prove that for any $S$, $P(\Delta x, \Delta y)$ is larger than $2^{-8}$
when $\Delta x = \Delta y$. Also it should be pointed out that this phenomenon can be
utilized in our analysis of E2 because E2 has only one substitution table in its
round function. If the function $S$ of the left hand side differs from that of the
right hand side in the above definition, the distribution of $P(\Delta x, \Delta y)$ will be
“flat”, independent of $\Delta x$ and $\Delta y$.
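
The behaviour of the probability in (6) can be checked numerically. The sketch below is illustrative only: it uses a randomly shuffled bijective table as a stand-in for $S$ (an assumption, it is not the E2 substitution table), and the helper names are hypothetical. For a bijective table the 256 inputs spread over at most 255 non-zero output differences, so the Cauchy-Schwarz inequality gives $P(\Delta x, \Delta x) \geq 1/255 > 2^{-8}$, which is what the first printed value reflects.

```python
import random

def difference_counts(sbox, dx):
    """counts[d] = number of x with S(x) ^ S(x ^ dx) == d."""
    counts = [0] * 256
    for x in range(256):
        counts[sbox[x] ^ sbox[x ^ dx]] += 1
    return counts

def pair_probability(rows, dx, dy):
    """P(dx, dy) of equation (6), with x and y uniform and independent."""
    return sum(cx * cy for cx, cy in zip(rows[dx], rows[dy])) / 256 ** 2

rng = random.Random(1)      # fixed seed, only for reproducibility
sbox = list(range(256))
rng.shuffle(sbox)           # hypothetical stand-in table, NOT the E2 s-box

rows = {d: difference_counts(sbox, d) for d in range(1, 256)}
equal = [pair_probability(rows, d, d) for d in range(1, 256)]
# One unequal partner per dx keeps the run time short.
unequal = [pair_probability(rows, d, d % 255 + 1) for d in range(1, 256)]

print("min  P(dx, dx) * 2^8:", min(equal) * 256)
print("mean P(dx, dx) * 2^8:", sum(equal) / len(equal) * 256)
print("mean P(dx, dy) * 2^8:", sum(unequal) / len(unequal) * 256)
```

The size of the gain for $\Delta x = \Delta y$ depends on the table, so the printed ratios for this stand-in need not match the factor observed for E2.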

5 Possible Scenarios of an Attack of E2 

5.1 E2 Reduced to Seven Rounds without IT and FT 

The discussions in the previous section show that for E2 reduced to seven rounds,
when $2^{104}$ chosen plaintext pairs such that $\Delta P_1 = \Delta P_4 = \Delta P_6$ are given, the
expected number of ciphertext pairs that have the difference pattern (10010100
00000000) is 9.3, where $P_1$, $P_4$ and $P_6$ denote the first, the fourth and the sixth
byte of plaintext, respectively. Note that these plaintext pairs can be obtained
using $2^{97}$ plaintext message blocks (97=104-8+1); for instance, they are given
as the direct product set of all possible $2^{64}$ patterns of the left half of plaintexts
and arbitrarily chosen $2^{33}$ patterns of the right half.

On the other hand, for a random permutation with the same chosen plaintext
pairs, the expected number of ciphertext pairs that have the difference pattern
(10010100 00000000) is 1. This leads to the following scenario for distinguishing
E2 reduced to seven rounds from a random permutation:

For a given cipher, if the number of ciphertext pairs that have the difference
pattern (10010100 00000000) is equal to or greater than a pre-determined value
t, regard it as E2 reduced to seven rounds, otherwise regard it as a random
permutation.

To estimate an appropriate value for $t$, we need the following lemma, which
can be easily proven:

Lemma 1. When a trial where an event occurs with probability $p$ is carried out
$n/p$ times, assuming $p$ is sufficiently close to 0 and $i$ is sufficiently smaller than
$n/p$, the probability that the event occurs exactly $i$ times is

$(e^{-n} n^i)/i!$.   (7)

Using this lemma, we see that $t = 4$ can be adopted in our case; for E2 reduced
to seven rounds, the probability that the number of ciphertext pairs having the
difference pattern (10010100 00000000) is equal to or greater than four is 98%,
while for a random permutation, the probability is expected to be 2%.
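
The two percentages follow directly from Lemma 1 with $n = 9.3$ for the reduced cipher and $n = 1$ for a random permutation. A minimal sketch of the calculation (the function names are illustrative, not taken from the paper):

```python
from math import exp, factorial

def prob_exactly(n, i):
    """Lemma 1: probability of exactly i occurrences when n are expected."""
    return exp(-n) * n ** i / factorial(i)

def prob_at_least(n, t):
    """Probability that the event occurs t times or more."""
    return 1.0 - sum(prob_exactly(n, i) for i in range(t))

p_e2 = prob_at_least(9.3, 4)      # E2 reduced to seven rounds
p_random = prob_at_least(1.0, 4)  # random permutation

print(f"E2 reduced to seven rounds: {p_e2:.1%}")
print(f"random permutation:         {p_random:.1%}")
```

Raising $t$ lowers the false-alarm rate for a random permutation at the cost of a lower detection rate for the reduced cipher; $t = 4$ balances the two at roughly 2% each.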

5.2 E2 Reduced to Eight Rounds without IT and FT 

By applying again the seven round characteristic to the first seven rounds of E2
reduced to eight rounds without IT and FT, we can narrow down the
possibilities of the subkey of the final (the eighth) round using the following algorithm:

For each candidate for the subkey of the final round, decrypt all ciphertext pairs by
one round. Then if the number of pairs that have the difference pattern (10010100
00000000) after the seventh round is less than a pre-determined value t, discard
the candidate as a wrong subkey.

Now let us use $2^{107} = 8 \times 2^{104}$ chosen plaintext pairs such that $\Delta P_1 = \Delta P_4 =
\Delta P_6$. Then if the candidate is the correct subkey, the expected number of pairs
that have the difference pattern (10010100 00000000) after the seventh round is
$8 \times 9.3 = 74.4$; however if it is a wrong subkey, the number of pairs is expected
to be 8. Note that these plaintext pairs can be obtained using $2^{100}$ plaintext
message blocks (100=107-8+1).

A direct calculation using lemma 1 shows that $t = 60$, for instance, can
sufficiently narrow down the possibilities of the subkey of the final round. Specifically,
for the correct subkey, the probability that the number of pairs having the
difference pattern (10010100 00000000) after the seventh round is equal to or greater
than 60 is 96%, while for the wrong subkey, the probability is expected to be
$2^{-103}$.
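
Because the wrong-subkey probability is far too small to obtain as one minus a partial sum, the tail is best summed directly with each term evaluated in the log domain. The following sketch does this for expected counts 74.4 and 8; the function names and the number of summed terms are illustrative choices.

```python
from math import exp, lgamma, log, log2

def log_prob_exactly(n, i):
    """Natural log of the Lemma 1 probability exp(-n) * n**i / i!."""
    return -n + i * log(n) - lgamma(i + 1)

def prob_at_least(n, t, terms=2000):
    """P(count >= t), summed term by term so tiny tails keep their precision."""
    return sum(exp(log_prob_exactly(n, i)) for i in range(t, t + terms))

p_correct = prob_at_least(74.4, 60)   # correct subkey, 74.4 pairs expected
p_wrong = prob_at_least(8.0, 60)      # wrong subkey, 8 pairs expected

print(f"correct subkey: {p_correct:.1%}")
print(f"wrong subkey:   2^{log2(p_wrong):.1f}")
```

The terms far beyond the mean underflow to zero, which is harmless here since they contribute nothing measurable to the sum.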

The straightforward method for realizing the algorithm above requires com- 
plexity more than 2^^®, but by discarding impossible pairs and introducing a 
counting method for the second layer subkey with 2®^ counters, we can reduce 
the complexity to less than 2^^®. 

5.3 E2 with a Modified Round Function

Our computer program has found that the seven round characteristic shown
in figure 5 is the best one in the sense that it attains the maximal number of
rounds with non-trivial probability. In this subsection, we try to find better
characteristics by modifying the round function without violating its design criteria.
Figure 7 is the modified round function we propose here. This modification, a
reordering of the output bytes of the round function (the BRL function),
neither eliminates any original operation nor violates the design criteria of E2.

This modified round function has the following “good” characteristics that
correspond to equations (2) and (5) in the original round function, respectively:

(10001000) → (10110000)   $p = 2^{-8}$,   (8)

(10110000) → (10001000)   $p = 2^{-16}$.   (9)

Figure 6 shows a nine round characteristic which holds with probability (2®®)^D 
(2®^®)® D 2®^^ = 2®®®"^, while for a random round function the probability is 
expected to be $(2^{-8})^{14} = 2^{-112}$, which is significantly smaller. Therefore in a
similar way to the previous subsection, we can extract subkey information of the
final (the tenth) round of the modified E2 reduced to ten rounds without IT and
FT.

The number of required plaintext pairs is $2^{109}$, which can be generated from
$2^{94}$ plaintext message blocks (94=109-16+1). Note that in this case we do not
have to choose special plaintexts since the probability that correct pairs are
detected is much larger than the probability that wrong pairs appear. An example
of an appropriate value for $t$ is 20; for the correct subkey of the final (the tenth)
round, the probability that the number of pairs having the difference pattern
(10001000 00000000) after the ninth round is equal to or greater than 20 is 99%,
while for the wrong subkey, the probability is expected to be $2^{-121}$.

Lastly let us consider the modified E2 reduced to nine rounds with IT and
FT. In IT and FT, 32-bit multiplications with subkey are used. However, since
this multiplication is modulo $2^{32}$, the upper 32 bits of the resultant 64-bit
product are simply discarded. Hence this multiplication has the following trivial
byte characteristic:

(1000) → (1000)   $p = 1$.   (10)

Fig. 7. Modified Round Function (MF)

It follows from equation (10) that the characteristic shown in figure 6 can skip
the IT and FT with probability 1. Therefore we have the following characteristic
connecting a plaintext block and a ciphertext block directly:

(10001000 00000000) → (10001000 00000000)   (11)
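
The byte characteristic (10) of the multiplication modulo $2^{32}$ can be checked empirically. The sketch below rests on two assumptions that are not stated in the text above: the first byte of the pattern is taken to be the most significant byte of the 32-bit word, and the multiplier is taken to be odd (so that the multiplication is invertible). Under these assumptions a change confined to the top byte is a non-zero multiple of $2^{24}$, and so is its product with an odd multiplier modulo $2^{32}$.

```python
import random

MASK32 = 0xFFFFFFFF

def byte_pattern(x, y):
    """Byte difference pattern of two 32-bit words, most significant byte first."""
    d = x ^ y
    return "".join("1" if (d >> shift) & 0xFF else "0" for shift in (24, 16, 8, 0))

def mul32(x, k):
    """Multiplication modulo 2^32: the upper half of the product is discarded."""
    return (x * k) & MASK32

rng = random.Random(0)                    # fixed seed, only for reproducibility
patterns = set()
for _ in range(10000):
    k = rng.getrandbits(32) | 1           # assumed: odd multiplier
    x = rng.getrandbits(32)
    delta = rng.randrange(1, 256) << 24   # non-zero change in the top byte only
    patterns.add(byte_pattern(mul32(x, k), mul32(x ^ delta, k)))

print(patterns)
```

Every trial yields the same output pattern, in line with the probability 1 claimed for (10).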

This means that in a chosen plaintext scenario, we can distinguish the
modified E2 reduced to nine rounds with IT and FT from a random permutation.
Specifically, create plaintext pairs with the difference pattern (10001000
00000000) from $2^{91}$ plaintext message blocks (91=106-16+1) and encrypt them.
Then if a ciphertext pair that has the difference pattern (10001000 00000000)
is found, regard it as the modified E2 reduced to nine rounds with IT and FT,
otherwise regard it as a random permutation ($t = 1$). For the modified E2 reduced to
nine rounds with IT and FT, the probability that at least one ciphertext pair has the
difference pattern (10001000 00000000) is 98%, while for a random permutation,
the probability is expected to be only 2%.

6 Discussions and Conclusions 

It is easily seen that the effectiveness of a byte characteristic can be evaluated
by $e$ = (Hamming weight of the byte difference pattern of the ciphertext pair)
$- \log_{2^8}$ (the characteristic probability). If $e$ exceeds 16, the characteristic is not
applicable to our analysis.
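
As a worked instance of this measure, the sketch below assumes the logarithm is taken to base $2^8$, that is, $e$ = weight $- (\log_2 p)/8$; this reading is an assumption, chosen because a random function produces a fixed pattern of weight $w$ with probability $(2^{-8})^{16-w}$ and therefore scores exactly 16. The probability used for the second example is likewise an assumed value, not a figure quoted from the text.

```python
def effectiveness(pattern, log2_probability):
    """e = Hamming weight of the byte pattern - log_{2^8}(characteristic probability)."""
    weight = pattern.replace(" ", "").count("1")
    return weight - log2_probability / 8

# Seven round characteristic of Section 4: probability 2^-104.
e_seven = effectiveness("10010100 00000000", -104)

# A two-byte output pattern with an assumed probability of 2^-104.
e_two_bytes = effectiveness("10001000 00000000", -104)

# A random function: weight w, probability (2^-8)^(16 - w).
e_random = effectiveness("10010100 00000000", -8 * (16 - 3))

print(e_seven, e_two_bytes, e_random)
```

A characteristic is useful only when its value lies below that of a random function, which is why 16 acts as the cut-off.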

We wrote a computer program for searching the best byte characteristic of
the modified E2 for all possible (8! = 40320) choices of the BRL function. The
following is the summary of the search:



maximal effective number of rounds   effectiveness   number of choices of
of the best characteristic           e               the BRL function

7                                    16              27688
7                                    15               8760
7                                    14                976
9                                    15               2896

Table 1: The best characteristic and the number of choices of the BRL function ¹

The designers of E2 have conjectured that the best nine round (ordinary bitwise) 
characteristic probability of E2 is much smaller than their evaluation 

methodology does not depend on a choice of BRL function [1] . 

Our analysis shows that for most cases (maximal effective number of
rounds = 7), including the real E2, this estimation works well. However for the
remaining 2896 cases, we can explicitly show a nine round bitwise differential
(not characteristic) whose probability is bigger than $2^{-120}$, which is significantly
larger than the designers’ estimation. This indicates that in a byte-oriented
algorithm, we should be careful of the existence of detectable differentials with high
probability.

¹ After the publication of an earlier version of this paper, Shiho Moriai [5] showed a
better attack of the real E2 based on another seven-round byte characteristic, whose
effectiveness is 15; we confirmed that this is the real best byte characteristic of E2.

On the Decorrelated Fast Cipher (DFC) and Its

Abstract. In the first part of this paper the decorrelation theory of
Vaudenay is analysed. It is shown that the theory behind the
proposed constructions does not guarantee security against state-of-the-art
differential attacks. In the second part of this paper the proposed
Decorrelated Fast Cipher (DFC), a candidate for the Advanced Encryption
Standard, is analysed. It is argued that the cipher does not obtain
provable security against a differential attack. Also, an attack on DFC reduced
to 6 rounds is given.



1 Introduction 

In [6,7] a new theory for the construction of secret-key block ciphers is given. The
notion of decorrelation to the order $d$ is defined. Let $C$ be a block cipher with
block size $m$ and $C^*$ be a randomly chosen permutation in the same message
space. If $C$ has a $d$-wise decorrelation equal to that of $C^*$, then an attacker
who knows at most $d - 1$ pairs of plaintexts and ciphertexts cannot distinguish
between $C$ and $C^*$. So, the cipher $C$ is “secure if we use it only $d-1$ times” [7]. It
is further noted that a $d$-wise decorrelated cipher for $d = 2$ is secure against both
a basic linear and a basic differential attack. For the latter, this basic attack is as
follows. A priori, two values $a$ and $b$ are fixed. Pick two plaintexts of difference $a$
and get the corresponding ciphertexts. Repeat a number of times. The attack is
successful if and only if at least one ciphertext pair with difference $b$ can be found
in a number of tries that is significantly less than $2^m$. Let $P(a, b) = \Pr(C(X \oplus
a) = C(X) \oplus b)$ denote the probability of the differential with plaintext difference
$a$ and ciphertext difference $b$, where the probability is taken over all plaintexts
$X$. To measure the security of the constructions against the basic differential
attack the probabilities of the differentials are averaged over all keys, denoted
$E(P(a, b))$. It is then argued that if $E(P(a, b))$ can be upper bounded sufficiently
low for all values of $a$ and $b$, e.g., $E(P(a, b)) \approx 2^{-m}$, then the differential attack
will not succeed.

Also, in 0 two families of ciphers are proposed both with the above proofs 
of security against the basic attacks. 

* F.W.O. postdoctoral researcher, sponsored by the Fund for Scientific Research,
Flanders (Belgium).

The Families of Ciphers

COCONUT: This is a family of ciphers parameterised by $(p(x), m)$, where $m$ is
the block size and $p(x)$ is an irreducible polynomial of degree $m$ in $GF(2)[x]$. A
COCONUT cipher is a product cipher $C_3 \circ C_2 \circ C_1$, where $C_1$ and $C_3$ are “any
(possibly weak) ciphers” 0 , and $C_2$ is defined

$C_2(y) = Ay + B \bmod p(x),$

where $A$, $B$ and $y$ are polynomials of degree at most $m - 1$ in $GF(2)[x]$. The
polynomials $A$ and $B$ are secret and act as round keys. Since the COCONUT
family has “perfect decorrelation” to the order two it is claimed that the ciphers
are secure against the linear and differential attacks.

PEANUT: This is a family of Feistel ciphers parameterised by $(m, r, d, p)$, where
$m$ is the block size (in bits), $r$ is the number of rounds, $d$ is the order of the
(partial) decorrelation, and $p$ a prime greater than $2^{m/2}$. The round function $F$
takes a text string and $d$ subkeys each of length $m/2$,

$F(x) = g((k_1 \cdot x^{d-1} + k_2 \cdot x^{d-2} + \ldots + k_{d-1} \cdot x + k_d \bmod p) \bmod 2^{m/2}),$

where $g$ is any permutation on $m/2$ bits. The DFC is a member of this family
(cf. Section 3).

The PEANUT family does not have perfect decorrelation like the COCONUT
family. This is due to both the use of the Feistel structure and to the round
functions, which are not perfectly decorrelated. The multiplications mod $p$ and
mod $2^{m/2}$ were chosen since they allow for more efficient implementations in
software as compared to multiplication in $GF(2^n)$. The price to pay is that this
leads to only partially decorrelated functions. However for sufficiently large values
of $r$ it is shown that the ciphers are secure against the linear and differential
attacks 0 .

In the first part of the paper it is shown that the above constructions based on
the decorrelation theory do not necessarily result in ciphers secure against
state-of-the-art differential attacks. Example ciphers from both families are shown to
be weak. In the second part of this paper we analyse the Decorrelated Fast
Cipher (DFC), which was submitted as a candidate for the Advanced Encryption
Standard (AES). DFC is an 8-round Feistel cipher and member of the
PEANUT family. It is shown that for any fixed key, there exist very high probability
differentials for the round function. Also, a differential attack is given on DFC
reduced to 6 rounds.

2 Analysis of the Constructions 

In this section it will be shown that the constructions in the previous section will
not resist differential attacks, thereby indicating a weakness of the decorrelation
theory 0 .

When analyzing the resistance of a cipher against differential attacks, one
often computes the probabilities of differentials over all plaintexts and all keys
g]. (Also, one distinguishes between characteristics and differentials; we use the
latter name for both concepts.) For one particular class of iterated ciphers, the
Markov ciphers, the probabilities of $r$-round differentials can be computed as
the product of the probabilities of the involved $r$ one-round differentials under
the assumption of independent round keys. Moreover, the probabilities are taken
only over all possible round keys. However, in an attack the encrypted texts are
typically encrypted under a fixed, but secret, key. To deal with this, one assumes
that the hypothesis of stochastic equivalence holds.

Hypothesis 1 (Hypothesis of stochastic equivalence [^ij) For virtually 
all high-probability differentials it holds for a substantial fraction of the keys
that the probability of the differential for the used key is approximately equal to 
the average probability of the differential, when averaged over all keys. 

The main reason for the criticism of the constructions based on the decorrelation
theory is that this hypothesis does not hold for the case of the decorrelation
modules $k_1 x + k_2$ in $GF(2^m)$ nor for multiplication modulo $p$ modulo $2^{m/2}$ for
prime $p$.

It is shown in the following that the distributions of differences through the
“decorrelation modules”, $k_1 x + k_2$, are very key-dependent. When considering
multiplication in the field $GF(2^m)$ with exclusive-or as the difference operation,
for any given input difference $a \neq 0$ and output difference $b$, the probability of
the differential $P(a, b)$ (notation from previous section) for a fixed key, is either
0 or 1. To see this, let $x$ and $x + a$ be two inputs to the module. The difference
in the outputs then is $k_1 x + k_2 + k_1(x + a) + k_2 = a k_1$. So, although $E(P(a, b))$
(the average probability taken over all values of the key) can be upper bounded
sufficiently low, in an attack one fixed key is used, and differentials of probability
0 and 1 can be found and exploited.
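
The contrast between the fixed-key and the key-averaged probability is easy to observe in a small field. The sketch below uses $GF(2^8)$ with the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ as a toy stand-in for $GF(2^m)$; the field size, the polynomial, the constants and the function names are all illustrative assumptions.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = GF(2)[x] / (x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def module(x, k1, k2):
    """Toy decorrelation module k1 * x + k2 over GF(2^8)."""
    return gf_mul(k1, x) ^ k2

def differential_probability(a, b, k1, k2):
    """P(a, b) for one fixed key, taken over all 256 inputs x."""
    hits = sum(1 for x in range(256)
               if module(x, k1, k2) ^ module(x ^ a, k1, k2) == b)
    return hits / 256

a, b, k2 = 0x35, 0x9C, 0x5A    # arbitrary non-zero differences and key part
per_key = [differential_probability(a, b, k1, k2) for k1 in range(256)]

print("values taken for a fixed key:", sorted(set(per_key)))
print("average over all k1:", sum(per_key) / 256)
```

Exactly one value of $k_1$ maps $a$ to $b$, so the average over keys is $2^{-8}$ while no individual key ever yields a probability strictly between 0 and 1.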

Note that the proof of security against the basic differential attack of the
introduction is not affected by these observations. Assume that $P(a, b) \approx 2^{-m}$
for an $m$-bit block cipher (notation as in the introduction). If the attacker is
restricted to choose the values in the differentials before analysing the received
ciphertexts the proof of security holds. However, this is not a realistic restriction
in our opinion. If for every fixed key there are high probability differentials, an
attacker will be able to detect this in an attack and exploit it.

Consider the COCONUT family. In [7] it is shown that $C$ will be secure
against the basic differential attack independently of the choices of the ciphers
$C_1$ and $C_3$. First note that COCONUT versions where $C_1 = C_3 = id$ (the identity
function) have high probability differentials for any fixed key. Also, such ciphers
are easily broken using two known plaintexts. One simply solves two equations
in two unknowns and retrieves $A$ and $B$. (This is also noted in 0.) However,
we argue based on the above discussion that if a COCONUT cipher is secure
against a (state-of-the-art) differential attack for a fixed key then it is because
at least one or both of $C_1$ and $C_3$ contribute to this security.

In 0 Wagner cryptanalyses COCONUT’98, a member of the COCONUT
family. The attack is a differential attack, which exploits that high probability
differentials exist for both $C_1$ and $C_3$.
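
For the degenerate case $C_1 = C_3 = id$ mentioned above, recovering $A$ and $B$ from two known plaintexts amounts to solving $z_1 = A y_1 + B$ and $z_2 = A y_2 + B$. A minimal sketch over the toy field $GF(2^8)$ (the field size, reduction polynomial, key values and function names are illustrative assumptions):

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) = GF(2)[x] / (x^8 + x^4 + x^3 + x + 1)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def gf_inv(a):
    """Inverse of a non-zero element, computed as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def c2(y, A, B):
    """Toy version of C_2(y) = A*y + B."""
    return gf_mul(A, y) ^ B

def recover_key(y1, z1, y2, z2):
    """Solve the two linear equations for A and B (requires y1 != y2)."""
    A = gf_mul(z1 ^ z2, gf_inv(y1 ^ y2))
    B = z1 ^ gf_mul(A, y1)
    return A, B

secret_A, secret_B = 0xC3, 0x7E     # hypothetical round keys
y1, y2 = 0x01, 0x02                 # two known plaintexts
recovered = recover_key(y1, c2(y1, secret_A, secret_B),
                        y2, c2(y2, secret_A, secret_B))
print(recovered == (secret_A, secret_B))
```

Subtracting the two equations eliminates $B$, which is why two distinct known plaintexts are enough.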

Consider next a variant of the PEANUT family of ciphers for $d = 2$, which
uses multiplication in $GF(2^{m/2})$ in the round function, and let $g$ be any affine
mapping in $GF(2^{m/2})$. The reasoning goes along the same lines as the reasoning
of the claim for COCONUT. All differentials through the decorrelation modules
have probabilities either 0 or 1, and the same holds for differentials through the
round function, since $g$ is affine. Consequently, since this holds for all round
functions, there are differentials of probability 0 and 1 for the whole cipher.

Consider now the PEANUT family. The multiplications mod $p$ mod $2^{m/2}$
were chosen since they allow for more efficient implementations in software as
compared to multiplication in $GF(2^n)$. Consider constructions for $d = 2$ with
multiplication defined in $GF(p)$, for prime $p > 2^{m/2}$, where the Feistel round
function is

$F(x) = g((k_1 \cdot x + k_2 \bmod p) \bmod 2^{m/2})$

for any permutation $g$. Let first $g$ be the identity function and let $p = 2^{m/2} + t$,
where $t$ is small. Let the difference between two $m/2$-bit texts, $x_1$ and $x_2$, be
defined as $d(x_1, x_2) = x_1 - x_2 \bmod p$ (subtraction modulo $p$). In the following
it is examined how such a difference distributes through $F$. First, note that
for randomly chosen $y$, where $0 \le y < p$, it holds that $(y \bmod p) \bmod 2^{m/2} =
y \bmod p$ with probability $p_1 = 2^{m/2}/(2^{m/2} + t) \approx 1$. So,

$d(F(x_1), F(x_2)) = d(k_1 \cdot x_1 + k_2 \bmod p,\ k_1 \cdot x_2 + k_2 \bmod p)$

with probability at least $(p_1)^2$. But since the multiplication modulo $p$ is linear
with respect to the defined difference, one gets that

$d(F(x_1), F(x_2)) = k_1(x_1 - x_2) \bmod p$

with probability at least $(p_1)^2$. The halves in the Feistel cipher are combined
using the exclusive-or operation, however it is also noted in [3 Th. 9] that the
proof of security for the construction remains valid if the group operation is
replaced by any other group operation. Assume therefore that the halves are
combined using addition modulo $2^{m/2}$. Let $w_1$ and $w_2$ be the two $m/2$-bit halves
from a previous round which are combined with the outputs $F(x_1)$ and $F(x_2)$
of the current round. Assume $d(w_1, w_2) = \beta$. Then with probability 1/2, $w_1 +
F(x_1) \bmod 2^{m/2} = w_1 + F(x_1)$ in $Z$, thus if $d(F(x_1), F(x_2)) = \alpha$, then

$d(F(x_1) + w_1 \bmod 2^{m/2},\ F(x_2) + w_2 \bmod 2^{m/2}) = \alpha + \beta$
with probability 1/4.
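
The near-certain propagation of a difference modulo $p$ through $F$ can be confirmed exhaustively for a small instance. The sketch below takes $m/2 = 16$ and $p = 2^{16} + 1$ (so $t = 1$), with $g$ the identity; the key and the input difference are arbitrary illustrative values.

```python
P_MOD = 2 ** 16 + 1    # the prime 65537, i.e. t = 1 (toy parameter choice)
HALF = 2 ** 16         # 2^(m/2) for m = 32

def F(x, k1, k2):
    """Round function with g = identity: ((k1*x + k2) mod p) mod 2^(m/2)."""
    return ((k1 * x + k2) % P_MOD) % HALF

def d(u, v):
    """Difference defined as subtraction modulo p."""
    return (u - v) % P_MOD

k1, k2 = 0x1234, 0xBEEF    # arbitrary fixed round key
a = 0x0101                 # arbitrary input difference
predicted = (k1 * a) % P_MOD

total = hits = 0
for x1 in range(HALF):
    x2 = (x1 - a) % P_MOD
    if x2 >= HALF:         # x2 must also be an m/2-bit text
        continue
    total += 1
    hits += d(F(x1, k1, k2), F(x2, k1, k2)) == predicted

fraction = hits / total
print(f"{hits} of {total} pairs have output difference k1 * a mod p")
```

The prediction can fail only when one of the two intermediate values equals $2^{16}$ and is changed by the final reduction, which happens for at most two values of $x_1$ in this instance.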

To sum up, differences modulo $p$ in the round functions of PEANUT
distribute non-uniformly. For any fixed round key, a given difference in the inputs to
$F$ results in differences in the outputs of $F$ with very high probabilities. Above
it was assumed that $g$ was the identity function. The point which is made here is
that if members of the PEANUT family are secure against differential attacks,
then it is because $g$ resists the differential attacks, and not because of the
decorrelation modules by themselves. In the next section a particular member of the
PEANUT family is analysed, where the function $g$ is not the identity function.

Note that the high probability differentials described in this section are
key-dependent and therefore unknown to an attacker. However, the fact that the key
is fixed in an attack means that the high probability differentials will occur.
This can be exploited in a standard attack, e.g., if the attacker guesses the
first-round key and/or the last-round key, the keys which produce high probability
differentials for the reduced cipher will be good candidates for the correct value
of the key(s). Furthermore, also differentials with probability significantly below
$2^{-m}$ can be used in a differential attack. This is illustrated by the attack we
present in Section 5.

3 The Decorrelated Fast Cipher 

The Decorrelated Fast Cipher (DFC) [2] has been submitted as a candidate for
the AES encryption standard [5]. DFC is a member of the PEANUT family,
described above. In the following a more precise definition of the DFC is given.
For a complete description of DFC the reader is referred to [2].

3.1 General Structure 

DFC is a block cipher with the classical Feistel structure. It uses 8 rounds to
transform a 128-bit plaintext block into a 128-bit ciphertext block, under the
influence of a key that can have a length up to 256 bits. The user key is expanded
to eight 128-bit round keys $K_i$. Every round key is split into two 64-bit halves,
denoted $A_i$ and $B_i$. In every round, the round function uses the right half of the
text input and the two 64-bit round key halves to produce a 64-bit output. This
output is exored with the left half of the text input. Subsequently, both halves
are swapped, except in the last round.

3.2 The Round Function

Let $X$ denote the 64-bit text input. First a modular multiplication is performed,
followed by an additional reduction.

$Z = (A_i \cdot X + B_i \bmod (2^{64} + 13)) \bmod 2^{64}$   (1)

Subsequently, the ‘confusion permutation’ is applied to $Z$: the value is split into
two 32-bit halves, denoted $Z_l$ and $Z_r$. $Z_l$ is exored with a constant $KC$. $Z_r$ is
exored with a table entry that is determined by the 6 least significant bits of
the original $Z_l$. Both halves are swapped, and the result is added with a 64-bit
constant $KD$.

$Y = ((Z_r \oplus RT[Z_l \bmod 64]) \ll 32) + (Z_l \oplus KC) + KD \bmod 2^{64}$
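
The two steps of the round function as described above can be sketched as follows. This is only an illustration of the data flow: the table `RT` and the constants `KC` and `KD` below are hypothetical placeholders, not the values fixed in the DFC specification, and $Z_l$ is assumed to be the upper 32 bits of $Z$.

```python
P64 = 2 ** 64 + 13
MASK64 = 2 ** 64 - 1
MASK32 = 2 ** 32 - 1

# Placeholder values: the real RT, KC and KD are NOT reproduced here.
RT = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(64)]
KC = 0x01234567
KD = 0x89ABCDEF01234567

def round_function(x, a, b):
    """One DFC-style round function on a 64-bit input x with key halves a, b."""
    # Equation (1): multiplication modulo 2^64 + 13, then reduction modulo 2^64.
    z = ((a * x + b) % P64) & MASK64
    z_l, z_r = z >> 32, z & MASK32          # assumed: Z_l is the upper half
    # Confusion permutation: table lookup on 6 bits of Z_l, swap, add KD.
    left = z_r ^ RT[z_l & 63]
    right = z_l ^ KC
    return ((left << 32) + right + KD) & MASK64

y = round_function(0x0123456789ABCDEF, 0xDEADBEEFCAFEF00D, 0x0F1E2D3C4B5A6978)
print(hex(y))
```

Swapping in the specified constants and table would be required before comparing against any test vectors.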

The result $Y$ is the output of the round function.

3.3 The Key Scheduling

The key scheduling first pads the user key with a constant string until a 256-bit
string $K$ is obtained. Subsequently, $K$ is divided into two 128-bit parts $K_1$ and
$K_2$. The keys $K_1$ and $K_2$ each define an invertible transformation, denoted $E_1()$
and $E_2()$ respectively. Let $RK_0$ denote a string of 128 zero bits. Then the round
keys of DFC are defined as follows.

$RK_i = E_1(RK_{i-1})$ if $i$ is odd   (2)

$RK_i = E_2(RK_{i-1})$ if $i$ is even   (3)



4 The Distribution of Differences in DFC 

First note that since DFC is a member of the PEANUT family, versions of DFC
which use only the decorrelation modules in the round function have very high
probability differentials. However, the round function of DFC is more than that.
To measure the distribution of differences through the round function of DFC
we first consider a simplified version. First change all exors to additions modulo
$2^{64}$ and remove the nonlinear S-box RT. This version is hereafter called DFC’.
The swapping of the 32-bit halves inside the $F$-function is retained. Note that
the proof of security for DFC’ is the same as for DFC. Consider one round of
DFC’. Define the difference of two 64-bit texts as the subtraction modulo $p$. The
following test was implemented. Randomly choose a difference $(\alpha_L, \alpha_R)$, where
both $\alpha$’s are 64 bits, in the inputs to one round. Randomly choose a pair $w_1, w_2$
of texts with difference $\alpha_L$ in the left halves. Randomly choose a pair of round
keys. For $n$ random choices of $x_1$ compute the differences of the outputs $y_1$ and
$y_2$ of the function $F$ for inputs $x_1$ and $x_2 = (x_1 - \alpha_R) \bmod p$. Compute and
store the differences of $y_1 + w_1 \bmod 2^{64}$ and $y_2 + w_2 \bmod 2^{64}$. Since modulo $2^{64}$
operations used to combine the halves in DFC’ are not completely compatible
with modulo $p = 2^{64} + 13$ operations, differentials are examined for the addition
of the Feistel cipher halves in addition to the round function $F$.

It is infeasible to do tests for all $2^{64}$ inputs, but as we will see, this is not
necessary in order to determine the distribution of the differences. In 10 tests with
$n = 10,000$ input pairs, the number of possible differences in the outputs and
the probabilities of the highest one-round differential were recorded. The 10,000
pairs of inputs lead to only an average of 13.6 possible output differences. The
average probability of the best one-round differential was 3/8. In similar tests
with 1,000,000 pairs, the average number of possible output differences was 14.0
still with an average probability of 3/8. Thus it can be expected that the
corresponding numbers for all possible inputs are close to these estimates. Note also,
these tests were performed for one randomly chosen input difference, thus, by
considering many (all) possible input differences higher probability differentials
can be expected.
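
An experiment of this kind can be sketched as below. It is not the authors' program: the way the constants enter (a single addition of `kc + kd` modulo $2^{64}$), the random choices and all names are assumptions made for illustration, so the exact counts it prints need not coincide with the figures reported above. What it does show is that only a handful of output differences occur for a fixed key.

```python
import random
from collections import Counter

P64 = 2 ** 64 + 13
MOD64 = 2 ** 64
MASK32 = 2 ** 32 - 1

def f_prime(x, a, b, kc, kd):
    """Assumed DFC' round function: exors replaced by additions, RT removed."""
    z = ((a * x + b) % P64) % MOD64
    z_l, z_r = z >> 32, z & MASK32
    return ((z_r << 32) + z_l + kc + kd) % MOD64    # halves swapped

def experiment(n, rng):
    a, b = rng.randrange(1, MOD64), rng.randrange(MOD64)   # round key pair
    kc, kd = rng.getrandbits(32), rng.getrandbits(64)      # placeholder constants
    alpha_l, alpha_r = rng.randrange(1, MOD64), rng.randrange(1, MOD64)
    w1 = rng.randrange(MOD64)
    w2 = (w1 - alpha_l) % P64
    counts = Counter()
    done = 0
    while done < n:
        x1 = rng.randrange(MOD64)
        x2 = (x1 - alpha_r) % P64
        if x2 >= MOD64:                 # keep only 64-bit inputs
            continue
        out1 = (f_prime(x1, a, b, kc, kd) + w1) % MOD64
        out2 = (f_prime(x2, a, b, kc, kd) + w2) % MOD64
        counts[(out1 - out2) % P64] += 1
        done += 1
    best = counts.most_common(1)[0][1] / n
    return len(counts), best

distinct, best = experiment(10000, random.Random(2024))
print("distinct output differences:", distinct)
print("probability of the best differential:", best)
```

The small number of differences comes from the few possible carry patterns in the half swap and in the additions modulo $2^{64}$.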

Thus, despite the fact that the round function is almost perfectly
decorrelated, very high probability differentials exist for any fixed key.

Table 1. Results of the experiments on simplified versions of DFC. Probabilities of
best differentials for a randomly chosen input difference in 10 tests with randomly
chosen keys. (*) Average in 10 tests of best differential for 100 randomly chosen input
differences.

             max. probability   # output diff.
DFC’         3/8                14
DFC”         1/128              808
DFC” (*)     0.37               370


Consider next a version of DFC where all exors are replaced by additions
modulo $2^{64}$, but where RT is unchanged, hereafter called DFC”. Note that the
proof of security for DFC” is the same as for DFC.

In 10 tests similar to the above with n = 10,000 input pairs, the number of 
possible differences in the outputs was an average of 715 and the probabilities 
of the highest one-round differential were 1/100 on the average. In similar tests 
with 1,000,000 pairs, the average number of possible output differences was 808 
with an average probability of 1/128. This is no surprise, since when the 6 bits 
input to RT in a differential are different, the outputs of the round will look 
random in one half. Since these 6 bits are equal with probability 1/64, these 
results are in correspondence with the test results on DFC’. Moreover, for a 
fixed key there are input differences such that the 6 bits input to RT are equal 
in more than the average case, and the probability of the differential will be 
higher. To test this phenomenon, we implemented some further tests. In 10 tests
a randomly chosen key was used. In each test, for each of 100 randomly chosen
input differences, 100,000 input pairs were generated and the output differences
recorded. The probabilities of the best such differentials for the 10 keys ranged
from 1/22 to 3/5 with an average of 0.37 and 370 possible output differences.
Table 1 summarizes the results of the experiments.

Since the only difference between the round functions of DFC” and DFC
is the use of three additions mod $2^{64}$ instead of three exors, it has been clearly
demonstrated that if DFC for a fixed key is secure against differential attacks it is
because of the use of mixed group operations and not because of the decorrelation
modules.

Estimating the uniformity of differences and computing the probabilities of
differentials are much harder for real DFC. To get an indication of such results, a
version of DFC with 32-bit blocks was implemented, hereafter denoted DFC$_{32}$.
The round function takes as input a 16-bit block, uses multiplication modulo
the prime $p = 2^{16} + 3$ followed by a reduction modulo $2^{16}$. The RT-table has
16 entries (the size of the table is chosen as the size of the inputs (in bits) to
the round function, in the spirit of DFC) with randomly chosen values, and the
constants KC and KD were chosen at random.

In 100 tests, the number of possible differences in the outputs and the
probabilities of the highest one-round differential were recorded for one randomly
chosen input difference and for all $2^{16}$ inputs.

Table 2. Results of the experiments on a scaled-down version of DFC. Probabilities
of best differentials for a randomly chosen input difference in 100 tests with randomly
chosen keys. (*) Average in 100 tests of best differential for 100 randomly chosen input
differences.

             max. probability   # output diff.
DFC32        1/397              6700
DFC32 (*)    1/91               1750

The 2^® pairs of inputs lead to an average of 6700 possible output differences. 
The average probability of the best one-round differential was 1/397 (and 1/21 
in the best case). By considering 100 input differences for every chosen key, the 
number of possible output differences dropped to 1750, and the average best 
probability increased to 1/91 (and 1/18 in the best case). 

Table 2 summarizes the results of the experiments. 

It can be argued that the RT-table chosen in this scaled-down version of DFC 
is too big relative to DFC. Repeating the last test above, this time with a 2-bit 
RT-table, the number of possible output differences dropped to 1051, and the 
average best probability increased to 1/49 (and 1/7 in the best case). 
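
The flavour of these experiments can be illustrated with a small sketch. It is not the authors' test code: it keeps only the multiplication modulo $2^{16} + 3$ and the reduction modulo $2^{16}$, and it leaves out the RT-table and the constants KC and KD, so its numbers are not comparable to Table 2. The function names and the way keys are passed are assumptions made for illustration.

```python
from collections import Counter

PRIME = 2**16 + 3      # prime used for the scaled-down multiplication
HALF = 2**16           # size of the 16-bit input/output space


def toy_round(x, a, b):
    """Simplified round function: affine map mod PRIME, then reduction mod 2^16.

    The RT-table lookup and the constants KC, KD of the real design are omitted.
    """
    return ((a * x + b) % PRIME) % HALF


def differential_profile(a, b, delta_in):
    """For a fixed key (a, b) and a fixed exor input difference, run over all
    2^16 inputs and return (number of distinct output differences,
    most frequent output difference, its probability)."""
    counts = Counter(
        toy_round(x, a, b) ^ toy_round(x ^ delta_in, a, b) for x in range(HALF)
    )
    best_delta, hits = counts.most_common(1)[0]
    return len(counts), best_delta, hits / HALF
```

Calling `differential_profile(a, b, delta_in)` for many random keys and input differences and averaging the results mirrors the structure of the tests described above, under the simplifications just listed.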

Based on the tests conducted here, it is hard to estimate the exact effect 
for the proposed DFC (without any modifications). However, the tests strongly 
indicate that the round function of DFC distributes differences modulo p in a 
very non-uniform way, and that high probability differentials exist. 

Summarizing, it was demonstrated that if the DFC is secure against the 
differential attacks it will be because of the elements that are independent of the 
proof of security. Also, it was clearly indicated that high probability differentials 
will exist for DFC for any fixed key. 

5 A Differential Attack 

The high probability differentials of the previous section might lead to a straight- 
forward differential attack on DFC. However, the large block size of DFC makes 
it hard to perform such tests. It is left as an open problem for the time being. 

In the following we present an attack on DFC when reduced to six of the 
proposed eight rounds. The attack does not depend directly on the findings 
in the previous section, but these are incorporated in a possible improvement 
described later. The attack uses a differential with S/N-ratio < 1. As explained 
in fP and j^, this type of differentials can be used in a similar way as ‘ordinary’ 
differentials with S/N-ratio > 1 to mount a differential attack. Before the attack 
is explained, we mention a property of the DFC key schedule that is useful in 
the attack. 

5.1 A Key Scheduling Weakness 

The first round key is defined as $RK_1 = E_1(RK_0)$. The string $RK_0$ is constant 
and the transformation $E_1()$ depends on one half of the key bits. The consequence 
is that the entropy of the first round key is at most one half of the entropy of 
the user key, e.g., for a 128-bit user key, the first round key has only an entropy 
of 64 bits. This property makes it easier for an attacker to bypass the first round 
by guessing the key. 

5.2 The F-Function Is Almost Bijective 

The F-function of DFC is almost bijective. The only non-invertible part of the 
F-function is the reduction modulo $2^{64}$ after the modular multiplication in the 
round function. Let $x_1, x_2$ be two different inputs and let $y_i = A \cdot x_i + B$, where 
$A$ and $B$ are the secret keys. The inputs $x_1, x_2$ will be mapped to the same output 
if and only if 

$$(y_1 \bmod (2^{64} + 13)) \bmod 2^{64} = (y_2 \bmod (2^{64} + 13)) \bmod 2^{64}.$$ 

If $x_1 \neq x_2$, the equality can only hold if either $y_1 \bmod (2^{64} + 13) \in \{0, 1, \ldots, 12\}$ 
and $y_2 = y_1 + 2^{64}$, or $y_1 \bmod (2^{64} + 13) \in \{2^{64}, 2^{64} + 1, \ldots, 2^{64} + 12\}$ and 
$y_2 = y_1 - 2^{64}$. For fixed values of $A$ and $B$, there can be at most 26 tuples 
$(x_1, x_2)$ with $0 \le x_1, x_2 < 2^{64}$ that result in equal output values. 

It follows that for any key $K = (A, B)$ 

$$\sum_{\alpha \neq 0} P(\alpha \to 0 \mid K) \le 26 \cdot 2^{-64}.$$ 

Because for every value of $\alpha$, $P(\alpha \to 0 \mid K)$ is a multiple of $2^{-64}$, there are for 
every round key value $K$ at most 26 $\alpha$'s such that the probability is larger than 
zero. 
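
The counting argument can be checked exhaustively on the scaled-down parameters used for DFC32. The sketch below is an illustration only: it assumes the prime $2^{16} + 3$ with 16-bit inputs and a nonzero 16-bit multiplier, in which case the analogous bound is $2 \cdot 3 = 6$ colliding tuples instead of $2 \cdot 13 = 26$. The function name is hypothetical.

```python
PRIME = 2**16 + 3   # scaled-down analogue of 2^64 + 13
HALF = 2**16        # scaled-down analogue of 2^64


def colliding_tuples(a, b):
    """Count ordered tuples (x1, x2), x1 != x2, that are mapped to the same
    value by x -> ((a*x + b) mod PRIME) mod 2^16, for a nonzero 16-bit a."""
    preimages = {}
    for x in range(HALF):
        y = ((a * x + b) % PRIME) % HALF
        preimages[y] = preimages.get(y, 0) + 1
    # An output with k preimages contributes k*(k-1) ordered tuples.
    return sum(k * (k - 1) for k in preimages.values())
```

Only residues in $\{0, 1, 2\}$ can collide with residues in $\{2^{16}, 2^{16}+1, 2^{16}+2\}$, so for any nonzero `a` the count returned is at most 6.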

5.3 A 5-Round Differential with Low Probability 

Consider the 5-round differential with both input difference and output difference 
equal to $(\alpha, 0)$, where $\alpha$ is an arbitrary value, different from zero. (We use the 
bitwise exor as difference operation.) In this section we will try to give an upper 
bound for the probability of this differential. In order for our attack to work, 
this upper bound should be significantly smaller than $2^{-64}$. 

From Figure 1 it is easy to see that a pair that follows the differential will have 
a difference of $(0, \alpha)$ at the input of the second round, and $(\alpha, 0)$ at the output of 
the fourth round. In the second round, the input difference to the F-function will 
lead to a certain output difference, denoted $\beta$. Similarly, reasoning backwards, 
it follows that in the fourth round, the difference at the input of the F-function 
equals $\alpha$. The output difference is denoted $\gamma$. It follows that the third round 
will have input difference $(\alpha, \beta)$ and output difference $(\gamma, \alpha)$. This requires that 
$\beta = \gamma$ and that the output difference of the F-function in the third round is zero 
and the input difference $\beta$. 

Note that the differential does not specify any particular value of $\beta$. The 
probability of the differential is thus given by the sum over all $\beta$-values of the 
probabilities of the characteristics. 

$$P_{\mathrm{dif}} = \sum_{\beta} P_{\mathrm{char}(\beta)}$$ 
Fig. 1. A 5-round differential with very low probability. 



We will approximate the probabilities of the characteristics by the product of 
the probabilities of the composing one-round characteristics. This may in fact 
cause inaccuracies, but it is a common assumption. The exact probability of the 
differential is not very important for our attack, and the experimental results 
confirm our analysis. We feel that a more exact calculation of the probability 
would needlessly complicate the analysis. 

As already explained in the preceding sections, the probability of a one 
round characteristic depends heavily on the value of the round key. 

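$$P_{\mathrm{dif}} \approx \sum_{\beta} P(\alpha \to \beta \mid K = K_2)\, P(\beta \to 0 \mid K = K_3)\, P(\alpha \to \beta \mid K = K_4) \qquad (4)$$ 

When calculating the probability of a characteristic, a distinction is made 
between the cases $\beta = 0$ and $\beta \neq 0$. If $\beta = 0$ 

$$P_{\mathrm{char}(\beta = 0)} = P(\alpha \to 0 \mid K = K_2)\, P(\alpha \to 0 \mid K = K_4).$$ 

Under the assumption of independent rounds, it follows from Section 5.2 that 
$P_{\mathrm{char}(\beta = 0)} \le (26/2^{64})^2 < 2^{-118}$. 

If $\beta \neq 0$ 

$$P_{\mathrm{char}(\beta)} = P(\alpha \to \beta \mid K = K_2)\, P(\beta \to 0 \mid K = K_3)\, P(\alpha \to \beta \mid K = K_4).$$ 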
It follows from Section 5.2 that for every value of $K_3$ there are at most 26 
$\beta$-values such that the probability is larger than zero. Also, from Section 4 it 
follows that for every value of the round key, there are $(\alpha, \beta)$ tuples such that 
$P(\alpha \to \beta)$ is relatively “high”. Let us denote this probability by $p_1$. For most 
other values of $(\alpha, \beta)$, the probability is much lower than $p_1$, possibly zero. We 
denote this probability by $p_2$. The values of $(\alpha, \beta)$ which correspond to high and 
low probabilities depend on the value of the round keys. The worst scenario for 
the attack would be that the round key values are selected such that there are 
values of $\alpha$ and $\beta$ with high $P(\alpha \to \beta)$ in both the second and the fourth round, 
and where also $P(\beta \to 0)$ is nonzero in the third round. The attack uses many 
different $\alpha$ values, therefore it can be expected that some of the $\alpha$ values will 
give a high $P(\alpha \to \beta)$ in the second or the fourth round, where $\beta$ has nonzero 
$P(\beta \to 0)$ in the third round. However, it can be argued that for almost all 
keys it is highly unlikely that there will exist an $\alpha$ such that $P(\alpha \to \beta)$ is high 
in the second and the fourth round for a suitable $\beta$. For almost all keys, the 
probability of the differential will be at most $26 \cdot 2^{-64} \cdot p_1 \cdot p_2$ for all values of $\alpha$. 
It is confirmed by the experiments performed that this probability is sufficiently 
low for the attack to work. 
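
The bound at the end of the paragraph can be traced as follows. This is only a sketch under the paragraph's own assumptions (independent rounds, and a key for which no pair $(\alpha, \beta)$ is "high" in both the second and the fourth round); it is not an additional result.

```latex
\begin{align*}
P_{\mathrm{dif}}
  &\approx P_{\mathrm{char}(\beta = 0)}
     + \sum_{\beta \neq 0} P(\alpha \to \beta \mid K_2)\,
       P(\beta \to 0 \mid K_3)\, P(\alpha \to \beta \mid K_4) \\
  &\le P_{\mathrm{char}(\beta = 0)}
     + p_1\, p_2 \sum_{\beta \neq 0} P(\beta \to 0 \mid K_3) \\
  &\le P_{\mathrm{char}(\beta = 0)} + 26 \cdot 2^{-64} \cdot p_1 \cdot p_2 .
\end{align*}
```

The second line uses that one of the two factors $P(\alpha \to \beta)$ is at most $p_1$ and the other at most $p_2$; the third line uses the sum bound of Section 5.2. The bound quoted in the text keeps only the last term.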



5.4 The Actual Attack 

The attack on 6 rounds works in almost the same way as the attack on 6-round 
DEAL [3]. The main differences are the following: 

1. The probability of the 5-round differential is not 0, but very low. 

2. The attack uses chosen ciphertexts instead of chosen plaintexts. The reason 
for this is that if the user key length is less than 256 bits, the first-round key 
of DFC has a lower entropy than the last-round key. It is therefore easier to 
recover. 

The attack starts as follows. Choose $2^{64}$ ciphertexts with a fixed right half and 
a variable left half, say $C_i = (X_i, R)$. Obtain the corresponding plaintexts, say 
$P_i = (Y_i, Z_i)$. Compute $X_i \oplus Z_i$ and find matches $X_i \oplus Z_i = X_j \oplus Z_j$. About 
$2^{63}$ matches can be expected. Let $\alpha = X_i \oplus X_j = Z_i \oplus Z_j$. Guess a value 
for the first-round key. For all the matching pairs, encrypt the plaintexts one 
round. If the difference after the first round is $(\alpha, 0)$, the guessed key value is 
wrong with high probability. For the correct value of the first-round key, in some 
rare cases the probability of getting right pairs might be relatively high, but as 
explained earlier, in by far the most cases this ratio is very low. Assuming that a 
wrong key produces uniformly distributed output differences, the difference $(\alpha, 0)$ 
will occur with probability $2^{-64}$ for each analysed pair. Thus, $2^{-64} \cdot 2^{63} = 0.5$ 
good pairs can be expected. Discarding all the key guesses that produce a good 
pair will eliminate about half of the wrong key values. Repeating this attack 
64 times eliminates almost all the wrong key values. The attack requires about 
$64 \cdot 2^{64} = 2^{70}$ chosen ciphertexts and $(2^{64} + 2^{63} + \ldots + 2 + 1) \cdot 2^{64} \approx 2^{129}$ evaluations 
of the round function, which is roughly equivalent to $2^{126}$ encryptions. 
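
The figures in this paragraph can be rechecked with a few lines of arithmetic. The sketch below only restates the counting argument of the text for a 128-bit user key; the function name and the returned keys are illustrative, and the conversion to encryptions assumes that one 6-round encryption costs six round function evaluations.

```python
from math import log2


def attack_cost(half_block_bits=64, key_entropy_bits=64, rounds=6):
    """Recompute the text's estimates; all results are log2 values
    except the expected number of good pairs per wrong key guess."""
    texts = 2 ** half_block_bits                  # one structure
    pairs = texts * (texts - 1) // 2              # unordered ciphertext pairs
    matches = pairs / 2 ** half_block_bits        # pairs with X_i^Z_i == X_j^Z_j
    good_pairs = matches / 2 ** half_block_bits   # per wrong key guess
    structures = key_entropy_bits                 # one halving per structure
    chosen_texts = structures * texts
    # Surviving key guesses halve with every structure: 2^64 + 2^63 + ... + 1.
    key_guesses = sum(2 ** i for i in range(key_entropy_bits + 1))
    round_evals = key_guesses * 2 * matches       # two texts per matching pair
    return {
        "log2_matches": log2(matches),
        "good_pairs": good_pairs,
        "log2_chosen_texts": log2(chosen_texts),
        "log2_round_evals": log2(round_evals),
        "log2_encryptions": log2(round_evals / rounds),
    }
```

With the default arguments the values agree with the estimates above up to rounding of the exponents.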
Table 3. Experimental results for DFC32, reduced to 6 rounds. All results are averages 
over 10 tests. For every structure, $2^{16}$ ciphertexts with a fixed right half are decrypted. 
The last column lists the average number of surviving wrong key guesses after all the 
runs. (The attack starts with $2^{16}$ possible values for the key.) 

  # structures    # wrong keys remaining 
  16              22.4 
  17              14.3 
  18              8.1 
  19              4.3 
  20              2.1 
  21              1.2 



Table 4. Adjusted chosen text requirements and work load of the attack on 6 rounds 
of DFC. 

  key length    # chosen texts        work load 
  128           $2^{71}$              $2^{127}$ 
  192           $1.5 \cdot 2^{71}$    $2^{158}$ 
  256           $2^{72}$              $2^{190}$ 



5.5 Implementation 

We implemented this attack on DFC32, a reduced version of DFC that operates 
on blocks of 32 bits. All constants and tables were reduced accordingly and the 
prime $2^{16} + 3$ was chosen for the modular multiplication. Because of the key 
scheduling weakness (cf. Section 5.1), the first-round key of DFC32 has 16 bits 
of entropy. The results of the previous section predict that 16 structures of $2^{16}$ 
texts should suffice to determine the 16-bit key uniquely. The results of 10 tests 
are given in Table 3. 

After repeating the basic attack sufficiently many times, only a few candidates 
are left for the secret key. The correct value of the secret key never resulted 
in the 5-round differential as described above, which justifies the approximations 
made in Section 5.3. The experiments suggest that in practice slightly more 
chosen plaintexts are required than predicted by the theory, because we get fewer 
good pairs than expected for the wrong key guesses. The net result is that every 
structure eliminates only 39% of the remaining key candidates, instead of the 
expected 50%. We therefore have to adjust our estimates for the plaintext 
requirements and the work load of our attack on 6 rounds of DFC. Increasing the 
plaintext requirements and the work load by a factor of two is more than 
sufficient. The results are given in Table 4, together with the estimates for attacks 
on DFC with other key lengths. The estimates for the work load use one 
encryption as the unit operation. The estimate for 256 bit keys is pessimistic, because 
in that case it is easy to speed up the round function evaluations significantly, 
since the modular multiplication does not have to be repeated for every new key 
guess. 
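
The effect of the lower elimination rate can be illustrated with a simple geometric model. This is a sketch, not part of the reported experiments: it assumes that every structure removes a fixed fraction of the remaining wrong key guesses, independently of earlier structures, and the function name is made up.

```python
from math import exp


def expected_survivors(key_bits, structures, elimination_rate):
    """Expected number of wrong key guesses left after the given number of
    structures, if each structure removes a fixed fraction of them."""
    wrong_keys = 2 ** key_bits - 1
    return wrong_keys * (1.0 - elimination_rate) ** structures


# Side observation (an assumption of this sketch, not a claim of the text):
# if the number of good pairs for a wrong key is modelled as Poisson with
# mean 0.5, the chance of seeing at least one good pair is 1 - exp(-0.5).
poisson_elimination_rate = 1.0 - exp(-0.5)
```

For a 16-bit key and 16 structures, the model leaves about one wrong key at a 50% rate and roughly 24 at a 39% rate, which is of the same order as the first row of Table 3.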
5.6 A Possible Improvement 

In this section a possible improvement of the attack is outlined which may reduce 
the complexity. The attack uses the observations in Section 4. 

Consider again the 5-round differential of the previous section, but allow 
now the nonzero half of the difference in the last round to be different from 
the nonzero half of the difference in the first round. The differential starts with 
a difference $(\alpha, 0)$. After the first round, the difference is $(0, \alpha)$. In the second 
round, the inputs to F with difference $\alpha$ lead to an output difference $\beta$. At 
the output of the 5th round, a difference $(\delta, 0)$ is required. In the fourth round 
the inputs to F with difference $\delta$ lead to an output difference $\gamma$. It follows that 
$\beta = \gamma$. The attack works as follows. Choose a structure of ciphertexts with equal 
right halves. For all values of the first-round key(s) $K_1$, find the plaintext pairs 
which yield equal inputs to F in the second round. For each such pair, one knows 
the input differences to F in the second and fourth rounds, and that the two 
output differences of F in the two rounds are equal. Since the number of output 
differences of F is limited for any given difference in the inputs, cf. Section 4, 
this gives an attacker information about the relation between one half of each of 
the keys $K_2$ and $K_4$. Note that the distribution of differences through F depends 
mostly on one half of the round keys. Repeat this procedure a number of times 
for each value of the first-round key. The value of the first-round key which gives 
rise to a frequent relation between the keys $K_2$ and $K_4$ is expected to be the 
correct value. It remains an open problem to what extent this variant of the 
attack will reduce the complexity of the previous attack. 

6 Conclusions 

We showed that the constructions of ciphers in the COCONUT and PEANUT 
families are weak against differential attacks. The main observation is that for a 
fixed key (which is the scenario of an attack) high probability differentials can be 
found. We analysed one particular member of the PEANUT-family: the DFC. 
It was shown that variants of DFC with only small modifications and with the 
same proof of security as the original, are vulnerable to a differential attack. For 
the proposed DFC it was indicated that differentials with high probabilities exist 
for the round function. Also, an attack, not directly related to these findings, 
on the proposed DFC reduced to six of eight rounds was given. Although the 
attack requires a large running time it is believed that the outlined possible 
improvement will be faster. 

The results in this paper do not contradict the theory of decorrelation [7]. 
More specifically, and in accordance with [7], ciphers which are d-wise decorrelated 
are provably secure against the following attacks. 

1. Any chosen plaintext attack using at most $d - 1$ plaintexts. 

2. If $d \ge 2$, the basic differential attack, where an attacker is restricted to 
choose the values in the differentials before the attack. 

3. If $d \ge 2$, the basic linear attack. 
The point we make is that restricting the attacker to such basic attacks does not 
lead to a strong proof of security. More specifically, we showed how some more 
advanced differential techniques can be used to attack decorrelated ciphers in 
general and reduced versions of DFC in particular. Although the decorrelation 
theory may be a valuable contribution to cryptographic research, it does not 
guarantee resistance against state-of-the-art differential attacks. 

Acknowledgments 

The authors thank Serge Vaudenay and the anonymous referees of the programme 
committee for helpful comments. 



References 

1. J. Borst, L.R. Knudsen, V. Rijmen, “Two attacks on reduced IDEA,” Advances in 
Cryptology, Proceedings Eurocrypt ’97, LNCS 1233, W. Fumy, Ed., Springer-Verlag, 
1997, pp. 1-13. 

2. H. Gilbert, M. Girault, P. Hoogvorst, F. Noilhan, T. Pornin, G. Poupard, J. Stern, 
S. Vaudenay, “Decorrelated fast cipher: an AES candidate,” Technical report, 
available from http://www.ens.fr/~vaudenay/dfc.html. Submitted as an AES 
candidate. See also http://www.nist.gov/aes/. 

3. L.R. Knudsen, “DEAL - a 128-bit block cipher,” Technical Report 151, Department 
of Informatics, University of Bergen, Norway, February 1998. Submitted as an AES 
candidate. See also http://www.nist.gov/aes/. 

4. X. Lai, J.L. Massey, S. Murphy, “Markov ciphers and differential cryptanalysis,” 
Advances in Cryptology - EUROCRYPT ’91, LNCS 547, D.W. Davies, Ed., 
Springer-Verlag, 1992, pp. 17-38. 

5. NIST’s AES homepage, http://www.nist.gov/aes. 

6. S. Vaudenay, “Feistel ciphers with L2-decorrelation,” Preproceedings of SAC’98, 
August ’98, Kingston (Canada). 

7. S. Vaudenay, “Provable Security for Block Ciphers by Decorrelation,” In STACS’98, 
Paris, France, LNCS 1373, Springer-Verlag, 1998, pp. 249-275. 

8. D. Wagner, “The boomerang attack,” In these proceedings. 




Scramble All, Encrypt Small 



Markus Jakobsson * Julien P. Stern ** Moti Yung *** 



Abstract. In this paper, we propose a new design tool for “block 
encryption”, allowing the en/decryption of arbitrarily long messages, but 
performing en/decryption on only a single block (e.g., 128 bit block), 
where the rest of the message is only processed by a good scrambling 
function (e.g., one based on an ideal hash function). The design can be a 
component in constructing various schemes where the above properties 
give an advantage. A quite natural use of our scheme is for remotely 
keyed encryption. We actually solve an open problem (at least in the 
relaxed ideal hash model and where hosts are allowed to add randomness 
and integrity checks, thus giving a length increasing function); namely, 
we show the existence of a secure remotely keyed encryption scheme 
which performs only one interaction with the smart-card device. 



1 Introduction 

We provide a basic design allowing encryption and decryption to be performed 
by combining a high quality scrambling technique with a strong encryption/ 
decryption mechanism. In particular the method applies to cooperation between 
an untrusted (potentially exposed) but computationally potent device and a 
trusted (highly secure) but computationally weak device. The method achieves 
an appropriate balance of computation and trust by employing a scrambling 
mechanism, which is modeled as an ideal length preserving one-way function 
which can be built from a regular ideal (random oracle like) hash function. 

While following the usual Feistel structure for block cipher design, we show 
that most of the work can be done by the (publicly known and invertible) scram- 
bling, requiring only a small portion to be performed by encryption. This situa- 
tion allows a design which can be useful when the encryption is more expensive 
than the scrambling, or when it is otherwise limited. This is the case when the 
encryption is done by a small device (e.g. a smart card) which is slower than a 
general host’s software (which has more computational power, but is less secure 
- thus encryption keys must never be exposed to the host). We call the design 
“scramble all, encrypt small.” 

A most natural application of our protocol is for “remotely keyed 
encryption”, which we motivate below. Yet, our protocol is of independent interest 
and can be used in other settings, for example to speed up certain encryption 
processes. In particular, our construction nicely applies in the context of traitor 
tracing [CFN94, Pfi96, NP98]. In traitor tracing, the data is encrypted with a 
random key, which is itself encrypted so that only users with legitimate keys can 
decrypt it (so the keys can be traced). Instead of encrypting the data with a 
random key, we can simply scramble it and encrypt a single block in the same 
way the random key is encrypted. 

The need for remotely keyed encryption was nicely motivated by Blaze. Let 
us recall the motivation. Today’s relatively open environment of hosts (e.g., In- 
ternet or Intranet servers) leads to a rather paradoxical situation in terms of 
security architecture: The hosts which are the gateways to the outside world (or 
internal users) are frequently required to perform cryptographic tasks, simply 
because they often use encryption to communicate at a low level, and as such, 
are the most vulnerable to various attacks (or even to exposure by simple im- 
plementation flaws). Given a cryptographic application or service, what is often 
neglected is its embedding environment and the implementation. It is argued 
that given that general software and the operating system running the software 
may be susceptible to viruses, Trojan horses, and other related attacks, there is a 
substantial risk of attackers momentarily gaining control of networked computers 
and the information stored by these. Therefore, secret keys should not be kept 
or used in this environment whenever possible. On the other hand, smart cards 
and other computational platforms with reduced risks of attacks often have a 
very limited computational ability and low storage capabilities. Consequently, it 
is desirable to divide the computation between an untrusted but powerful device 
(which we will call the slave) and a trusted but weak device (the master). 

This problem, which in its encryption incarnation is known as remotely keyed 
encryption schemes (RKES) was proposed in [Bla96] with no model or proof and 
with certain subtle problems. Several solutions have already been suggested to 
solve this problem [Bla96, Luc97, BFN98]. If we momentarily disregard the secu- 
rity issues, the common aspect of these three schemes is that they ask a smartcard 
to generate a “temporary” key, which depends on the message, and which is used 
to encrypt the largest part of the message. This generates a “binding” between 
the message and the device via the encryption. They later ask to hide this key. 
A formal model and secure solution for this (based on pseudorandom functions) 
was given in [BFN98]; their solution requires two accesses to the smartcard per 
operation. 

While our system provides the same functionality, the use of the “scramble 
all, encrypt small” notion is different: in our scheme, the host does not perform 
any encryption. It simply scrambles the message in a publicly available (and in- 
vertible) way (after adding randomness and integrity check), and then deprives 
an adversary of the ability to invert the scrambling by letting the smartcard 
encrypt just a small part of the scrambled message. Also, in our scheme, there 
is a single access to the trusted smart card, which performs a single 
decryption/encryption operation before replying. This is implied by the fact that the 
host does not use a temporary key for encryption and does not encrypt at all, 
but rather employs scrambling of high quality. In some sense the earlier 
constructions can be viewed as an unbalanced Feistel (Luby-Rackoff type) 
encryption, namely a pseudorandom permutation based on pseudorandom 
functions, where the pseudorandom functions are on the smart card. This seems to 
require two applications of the pseudorandom function. This implies (only intu- 
itively) the need for two calls to the card. Thus, to get the result with “a single 
call” to the smart card we rely on the ideal (random oracle) hash model. It is an 
open question whether this stronger assumption model is necessary for achieving 
the “single call” result. 
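
To make the idea of public scrambling followed by a single small encryption concrete, here is a minimal sketch. It is not the construction of this paper, whose details are given in later sections; it is a generic two-pass hash masking in the spirit of [BR94], with SHA-256 standing in for the ideal hash, no integrity check, and with all function names and the 16-byte block length chosen for illustration. Only the short value `small` would be handed to the trusted device for encryption.

```python
import hashlib
import os


def mask(seed: bytes, length: int) -> bytes:
    """Expand a seed to `length` bytes with SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def scramble(message: bytes, block_len: int = 16):
    """Keyless, publicly invertible scrambling with fresh randomness.

    Returns (body, small): the long scrambled part and one short block."""
    r = os.urandom(block_len)                          # host-chosen randomness
    body = xor(message, mask(b"G" + r, len(message)))  # hide message under r
    small = xor(r, mask(b"H" + body, block_len))       # hide r under the body
    return body, small


def unscramble(body: bytes, small: bytes) -> bytes:
    """Invert scramble(); requires the short block `small`."""
    r = xor(small, mask(b"H" + body, len(small)))
    return xor(body, mask(b"G" + r, len(body)))
```

Anyone holding both outputs can invert the scrambling, since no key is involved. Without `small`, the value `r` and hence the mask over the message cannot be recomputed, which is the property the single card encryption is meant to exploit.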

Since the host does not use encryption, the method gives a way to encrypt 
arbitrarily long messages (with a strong notion of security), while encrypting only 
one block (of size, say, 128 or 256 bits). We also allow the host to add randomness to 
the encryption (this can be viewed as an added random IV) so we can formalize 
our mechanism and security w.r.t. an overall randomized encryption (which 
eases the difficulty of pseudorandom ones as in the earlier works). The host also 
adds an integrity check to prevent message modifications and other attacks. 

We validate our design by giving a proof of security assuming the scrambling 
is an ideal hash and the small encryption is an ideal block cipher (random 
permutation). 

From an engineering perspective, the Bear and Lion designs [AB96] have 
taught how to encrypt an arbitrarily long block given fast stream ciphers. This 
is suitable for applications with large messages. While we have the same 
applications in mind, we teach how to minimize the encryption mechanism to a 
small block of size 128 bits, say. This minimization may be due to configuration 
and performance constraints. Thus, our design where only small portions are 
encrypted (rather than the entire large block which here is only scrambled 
without a key) may be called “Cub” as opposed to “Bear” and “Lion”. Note that we 
do not claim that there is a universal performance advantage of Cub over the 
earlier designs which applies to all working environments. On the contrary, we 
understand very well that stream cipher encryption may be fast compared to 
a good scrambling mechanism. What is more interesting is the minimization of 
encryption itself and the confinement of the entire cryptographic operation. 

Finally, we remark that since the design deals with the constraints of having 
a slow encryption whose use is minimized, it may fit well when the encryption 
employed is obviously much slower than available good scrambling techniques 
based on cryptographic hash functions. If the available encryption is based on 
public-key, the relative performance resembles the gap of “remotely keyed 
encryption”. 

Remark on presentation: While we present a general two-process method, 
its implementation as a “remotely keyed” mechanism is the prime example and 
will serve as the working example (since it is concrete and it further adds envi- 
ronmental constraints). 

Outline: Section 2 discusses previous work on the subject. Section 3 introduces 
our model, and section 4 presents the basic tools we will use. Section 5 explains 
our new solution. Validation of the design via a security proof is given in section 6 
and experimental results are outlined in section 7. Finally section 8 presents open 
problems and concludes the work. 

2 Related work 

There are many issues in block cipher design (for extensive review see the appro- 
priate sections in [Sch96, MOV97]). Feistel introduced the fundamental notion 
of rounds where a one-way function of half the “current message” is exor-ed into 
the other half. Provable constructions based on one-way functions that act glob- 
ally on half the block (analogous to “large s-boxes”) were first investigated by 
Luby and Rackoff [LR88]; their functions were based, in fact, on pseudorandom 
functions. Many other investigations followed which included simplifications of 
the analysis and the proof of security and variations/simplifications of the round 
functions (see [Mau92, Luc96, NR97]). 

In particular, employing stream-ciphers for fast block message encryption 
in the style of Luby Rackoff was presented in [AB96]. Their design like the 
one in this work is suitable for block cipher operations for large blocks, which 
may be typical in software oriented mechanisms. Adding public scrambling via 
hashing prior to encryption was considered in the past. A prime example is the 
scrambling using ideal hash and pseudorandom generator in [BR94]. Another 
example is in a work (coauthored by the third author) [IBM-pa.]. However, none 
of these works put forth the notion of global scrambling combined with local 
(small block) encryption as a possible provable design. Related work which we 
have learned about recently, is in [MPR97, MPRZ98]. They do suggest a design 
where scrambling via hashing is done prior to partial encryption as in our case, 
but they do not give a security validation proof. They suggested their design to 
standardization committees, and our validation proof may support their effort. 
Their work does not have the context or the consideration for remotely keyed en- 
cryption; rather they suggest it in the context of public-key encryption. The goal 
of the current work is to suggest minimized encryption which is validated and 
can be used systematically whenever needed, possible or useful. We character- 
ize scenarios (alternatives) in working environments where we get performance 
advantage. 

Blaze’s remotely keyed encryption protocol [Bla96] was based on the idea of 
letting the host send the card one block which depends on the whole message. 
This block would be encrypted with the smart card secret key, and would also 
serve as a “seed” for the creation of a “temporary” secret key, that will be used by 
the host in order to encrypt the rest of the message. However, as Lucks [Luc97] 
pointed out, Blaze’s scheme had problems in that it allowed an adversary to 
forge a new valid plaintext/ciphertext pair after several interactions with the 
card. Lucks suggested an alternative model and protocol, which in turn was 
attacked by Blaze, Feigenbaum and Naor [BFN98], who further demonstrated 
the subtleties of this problem. They showed that the encryption key used for 
the largest part of the message is deterministically derived from the two first 
blocks of the message, hence an adversary who takes control of the host will 
be able to decrypt a large set of messages with only one query. They derived 
a careful formal model and a scheme based on pseudorandom functions. Very 
recently, Lucks [Luc99] further extended their security model and suggested a 
faster scheme. In contrast, we will allow randomized encryption and we will rely 
on ideal (random oracle) hash functions. 

Our work makes sure that missing a small piece of the scrambled data (via 
encryption) while keeping the rest available makes it hard to recover the 
message. This bears some relationship to Rivest’s notion of All-Or-Nothing 
encryption [Riv97]. Informally, given a symmetric cipher, a mode of encryption is 
defined to be strongly non-separable when it encrypts s blocks of plaintext into 
t blocks of ciphertext ($t \ge s$) which are such that an attacker cannot get any 
information on the plaintext without decrypting all the ciphertext blocks. In 
order to obtain strongly non-separable encryption, Rivest suggests performing 
an “all-or-nothing” transform on the plaintext, followed by a regular encryption 
of the result. There is an obvious parallel between our scrambling step and the 
all-or-nothing transform. As a matter of fact, our scrambling step possesses a 
slightly more general property than strong non-separability. It has the 
property that no information can be gained on the pre-image of a scrambling as soon 
as any k bits are missing (in the design we fix which k bits to hide). The two 
preprocessing steps are interchangeable: Rivest’s all-or-nothing transform could 
be followed by a single encryption, and our scrambling yields an all-or-nothing 
encryption mode. However, the motivations of the two notions are very 
different. Our goal is to design a scheme which minimizes encryption with a given 
key, while Rivest’s goal is to make brute-force decryption more difficult for an 
adversary. As a final note, we should point out that our technique is much 
more efficient than the one proposed in [Riv97], notably because we do not use 
encryption during the preprocessing step. 



3 Model and Definitions 

We will present a two-stage model: scrambling and encryption. As noted above, 
the presentation follows the remotely keyed encryption model. 

3.1 Model 

Our scheme involves two connected devices: a computationally potent device, let 
us call this device the slave device, and a computationally weak device, which 
we denote the master device. 

We assume a limited and low bandwidth communication channel between 
the master and the slave. We trust the master device to follow the protocols and 
to be tamper resistant. We only trust the slave device to be able to perform the
operations it is asked to. On the other hand, we do not trust it for being intrusion 
resistant: we assume that an adversary may take full control of it for some time 
period. During this preliminary period the adversary may in particular obtain 
any information that the slave has and also interact with the master in any way
he likes. 

We would like to construct schemes that allow the slave device to perform 
encryption (and decryption) on large messages at high speed, with the help of 
private information owned by the master device. In full generality, we may
consider that the master can own keys corresponding to any kind of encryption 
(symmetric, public-key, probabilistic or non-probabilistic), and that the encryp- 
tion obtained at the end of the protocol can be of any of the previous types. 
However, for public-key we may employ the master only for decryption, and 
with symmetric encryption the type of queries and attacks is larger (since “en- 
cryption queries” are meaningful). We consider as our example model of choice 
a master which employs an ideal cipher (e.g., a (pseudo)-random permutation), 
and a slave which performs sampling of a randomized IV, and scrambling via a 
public ideal (random oracle) hash. 

The requirements on the schemes are as follows:

Balanced computation. The slave should perform the largest possible part 
of the computation, the master should perform the lowest. Their respective 
parts of the computation can be proportional to their computing rate (thus 
we balance the time spent by each component). Other choices of resource 
balancing are possible, e.g., limiting the slow component to a “constant usage”
and varying the fast component as a function of the message size.

Low communications. The number of interactions between the slave and the 
master should be low, and each of these interactions should need only a small 
amount of communication. Ideally, there should be only one interaction per 
protocol. 

Security. Intuitively and informally, we require that after having taken control 
of the slave and making a number of queries (bounded by some polynomial) 
to the master, and then losing the control of the slave, an adversary will not 
have any advantage in distinguishing subsequent plaintext/ciphertext pairs 
from random pairs. Variations on the security requirements are possible: e.g., 
the adversary may choose the plaintext for which a distinguishing challenge is 
required. Other challenges than distinguishability are possible as well. Here, 
we will consider two attacks. (Though the above talks about “polynomial 
time” and “advantages” in general terms, we will actually compute actual 
probabilities of successful attacks). 

Of course, the interaction and the encryption blocks are still large enough in 
the size of the security parameter. Namely, the security of the protocol is assured 
sub-exponentially in the size of that block (e.g., 128 or 256 bit size). 

3.2 Definitions 

Configuration A probabilistic remotely keyed encryption scheme (PRKES) consists
of two protocols, one for encryption, the other for decryption, both executed
on two communicating devices. These devices are (1) a probabilistic machine
(machine with a truly random generator) called a “master” and (2) a machine 
called a “slave”. (The slave can also be probabilistic in some designs. In fact, 
herein we model the slave as an ideal cipher, namely a random permutation). 
Given an input to the slave, it interacts with the master and produces an output. 
The input includes an indicator for performing either encryption or decryption, 
possibly a size parameter, and the argument (resp. cleartext or ciphertext). 



Background for attacks Let us review attacks on the protocol. 

The polynomial-time attacker A (we will use concrete assumptions regarding 
the “polynomial-time power”) has a challenge phase, and one or more probing
(tampering) phases. Typically the probing enables the adversary to activate the 
device up to some bound (some polynomial in the key length). The probing is 
a preliminary step prior to the challenge (or a prior step followed by additional 
probing with certain restriction after the challenge has been issued). 

In the challenge phase, a certain goal is required from A: 

Distinguishability challenge: A is presented with a challenge pair (C_1, C_2), which
is either a plaintext/ciphertext pair or a random pair from the same distri- 
bution. (Below we specify one such distribution). 

Valid pair creation challenge: Another possible type of challenge, yielding 
another attack, is to ask A to exhibit one plaintext/ciphertext pair more 
than the number of probes he performed. 

The probing phase. As noted above it can come before the challenge but also 
some limited part of it can be allowed to occur after the challenge. 

We consider two probing phases regarding PRKES: 

System probing: The attacker gets input-output access to the slave but not 
to its memory. Namely, he uses the slave as an oracle. 

Master probing: The attacker gets full access to the memory of the slave. He 
can use the slave to interact with the master (input-output access) on queries 
of its own using the master as an oracle. 



Attacks and security against them Let us next describe the entire attack. 
We consider two types of attack, which differ only in the challenge phase,
described below. 

An adaptive chosen-message chosen-ciphertext attack with distinguishability
challenge includes: 

— first phase: A performs “master probing”. We assume that it performs up to
p_1 pre-challenge probes. We also assume that the slave is reset (to a random
state) after the intrusion, that is, before the second phase. (This ensures that
no state information on the slave is known after the probing.)
— second phase (distinguishability challenge): A presents a plaintext (or
plaintexts) pi and gets a challenge pair (plaintext, value) whose plaintext part
is pi and the value part is either a ciphertext of the plaintext or a random 
value, each case chosen with probability 1/2. 

— third phase: A performs “system probing” where he can ask any oracle query 
(including further encryptions of the challenge plaintext) but is not allowed
to perform a decryption query on the ciphertext of the challenge pair. We
allow up to p_2 such post-challenge probes.

Security: We say that a PRKES is secure against a distinguishability attack if 
for a plaintext of its choice, A cannot distinguish a valid (plaintext/ciphertext)
pair from a (plaintext/random-text) pair with probability asymptotically better
than 1/2 (for the same chosen plaintext). 

An adaptive chosen-message chosen-ciphertext attack with valid pair creation 
challenge includes: 

— first phase: A performs “master probing”. We assume that it performs up to
p_1 pre-challenge probes. We also assume that the slave is reset (to a random
state) after the intrusion, that is, before the second phase. (This ensures that
no state information on the slave is known after the probing.)

— second phase (valid pair creation challenge): A is challenged to exhibit p_1 + 1
valid plaintext/ciphertext pairs.

Security: We say that a PRKES is secure against a valid pair creation attack if 
A is able to answer the challenge only with an asymptotically small probability 
(to be computed concretely). 

Remark: It should be noted here that our definition is different from the 
basic definition in [BFN98]. Here the definition follows the one in (adaptive) 
chosen ciphertext security (See [NY90, RS92, DDN91, BDPR98]). This is due to 
a difference in the model. In [BFN98], length preserving encryption was consid- 
ered. As a consequence, their encryption model was deterministic, and thus, their 
definition required the introduction of an arbiter to filter the choice of A in the 
second phase (whereas in our case, internal randomization may allow oracle style 
probing on the challenge). In a recent extended version of their paper [BFN99], 
a formal model and treatment of length increasing functions was given as well; 
it formalized an indistinguishability attack. Our indistinguishability attack is of 
a similar nature. 



4 Basic Tools 

Ideal hash function We assume the existence of an ideal hash function (a 
function whose behavior is indistinguishable from a random oracle, see [BR93]). 
In numerous practical constructions, the validation or indication of the security 
of the design was based on such an assumption; we use the assumption in the
same manner here. In practice, the ideal hash function may be replaced by a 
strong hash (such as one based on SHA-1). Then from this hash function, we 
show how to construct, in a very simple manner, an ideal length-preserving one- 
way function H which - apart from the fact that its output has the same length 
as its input - has the same properties as an ideal hash function. 

Let h be an ideal hash function and l be the size of the hash produced by
this function. Let x be a message of size n. We assume, for ease of explanation,
that l divides n.

We define H_i as H_i(x) = h(t || i || h(x)) (where || denotes the concatenation
and t is a tag designated specifically for this usage of the hash function, and
which can include a specific number and the length of x).

Then, we can define H(x) as being the concatenation of the H_i(x) so that
the size of H(x) matches the size of x:

H(x) = H_0(x) || H_1(x) || ... || H_{(n/l)-1}(x)

If l does not divide n, we simply concatenate with the first bits of one extra
block, H_{⌊n/l⌋}(x), so as to ensure that the sizes match.
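The construction of H can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 stands in for the ideal hash h, lengths are counted in bytes rather than bits, and the tag value and the integer encodings are hypothetical choices of ours. The sketch follows the formula as printed, with the inner h(x); if each block is meant to hash x directly, `inner` would simply be replaced by `x`.

```python
import hashlib

HASH_LEN = 32  # l, in bytes: output size of the stand-in hash h


def h(data: bytes) -> bytes:
    """Stand-in for the ideal hash h (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def H(x: bytes, tag: bytes = b"H-len-preserving") -> bytes:
    """Length-preserving hash built from h.

    Concatenates H_i(x) = h(t || i || h(x)) for i = 0, 1, ... and keeps
    the first len(x) bytes, so the output is as long as the input.
    """
    inner = h(x)
    # The tag t may include the length of x (hypothetical 8-byte encoding).
    t = tag + len(x).to_bytes(8, "big")
    out = bytearray()
    i = 0
    while len(out) < len(x):
        out += h(t + i.to_bytes(4, "big") + inner)
        i += 1
    # Truncation covers the case where l does not divide n.
    return bytes(out[:len(x)])
```

Every output block depends on the whole message through h(x), which is the "global" property the argument below relies on.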

It is easy to argue that since each sub-block of H depends on the whole 
message, that if h is ideal in the random oracle model, H is ideal as well. (In short, 
this is so since any weakness (say, a bias from randomness or predictability) with 
H can translate to a weakness on one of the blocks and thus to a weakness with 
h). We comment that if (unlike the construction above) each block of H is not 
global (sensitive to all bits) there may be problems. These problems arise both
from the definition of an ideal hash function and from certain concrete problems
pointed out in [And95].

While the above construction allows us to build a pseudo-random stream 
of the same size as the message without relying on assumptions other than
ideal-ness of the hash function, we can, for efficiency reasons, replace the above 
quadratic construction by faster ones. A single regular hashing of the message 
can be used as a seed for a pseudo-random number generator, which can be, 
for instance, the PRNG suggested in [AB96]. We can also employ the Panama 
cipher [DC98] (which combines hashing and stream cipher). 
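The faster variant (hash once, then expand) can be sketched as follows. The generator used here is not the PRNG of [AB96] nor Panama: as an assumption made purely for illustration, the digest is expanded with the SHAKE-256 extendable-output function from the Python standard library, and the `b"expand"` prefix is a hypothetical domain separator.

```python
import hashlib


def H_fast(x: bytes) -> bytes:
    """Hash x once, then stretch the digest to len(x) bytes.

    Assumptions: SHA-256 plays the role of the regular hash, and
    SHAKE-256 plays the role of the pseudo-random generator.
    """
    seed = hashlib.sha256(x).digest()
    return hashlib.shake_256(b"expand" + seed).digest(len(x))
```

The message is read only once, so the cost is linear in its length, at the price of relying on the generator in addition to the hash.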



Encryption function We now discuss the properties that we require for the 
encryption function used by the master (smartcard). We require the encryption 
to be at least pseudorandom (a pseudorandom permutation or function). This 
assures “strong security” in the sense that no partial information is revealed 
about the input given the output (and vice versa). It protects as in semantic 
security except for the “block repeat problem” where the encryption of the same 
block is deterministic. If we add a random IV we get rid of this “issue” as well. 

Denote by k_in the input size of the cipher, and by k_out its output size. For
symmetric ciphers without IV we have k_in = k_out. Indeed, this may be a
pseudorandom function (such as encrypting a block in ECB or CBC mode; or we
may allow adding an IV, increasing the input size). k_in = k is our security parameter
(namely, we will achieve security while allowing polynomially many probes in its size),
and we assume that the encryption above is invertible with probability 2^{-(k_in)^δ}
for some constant δ (to an adversary which gets to see a concrete given bound
of cleartext/ciphertext pairs). When validating our design we will assume the
encryption to be an ideal cipher, namely a random permutation (so that the
probability of inverting above can be easily related to the number of
cleartext/ciphertext pairs available to a concrete attacker).

5 The Scheme 

Let n be an integer, which represents the size of a message to be encrypted, and 
let M ∈ {0, 1}^n be this message. Also let k be a security parameter. We will
denote by x the secret key held by the card, and by E(·) and D(·) the encryption
and decryption functions used by the card. Finally, we will denote by H a
length-preserving hash function, as defined in the previous section.

We first present the encryption scheme: 

Encryption 




Fig. 1. The Encryption Algorithm 



The encryption protocol is sketched in figure 1 and precisely goes as follows: 
Slave computation Let M be the plaintext to be encrypted. The slave first
chooses a uniformly random number U of size polynomial in the
security parameter (say k_in bits, without loss of generality) and pads M on
the left with R = U || h(M, U).

Then, if the resulting string is of odd size, an additional padding of one 
random bit is performed (this can be dealt with by the definition of the hash 
function h). Remark: we assume the length of the input is known and fixed, 
otherwise a byte with the actual length of the randomness can be appended 
as well; this also enables dealing with encryption of small messages. 

Then, the resulting string (again denoted M) is split into two parts of equal
size, M_a and M_b (M = (M_a || M_b)), and the slave computes:

C_b = M_b ⊕ H(M_a)

C_a = M_a ⊕ H(C_b)

Master computation The slave extracts the k_in last bits of C_b (say S) and
asks the master to encrypt it. The master computes the encryption T of S
(T = E(S)) and sends it back. This ciphertext is k_out bits long.

Final slave operation The slave finally erases the k_in last bits of C_b and
replaces them by T.

Decryption 

The decryption protocol is as follows: 

Master computation Let C be the ciphertext to be decrypted. C is split into
two equal parts (assuming the master’s encryption is non-expanding), C_a and C_b
(C = (C_a || C_b)). The slave extracts the k_out last bits of C_b (say T) and asks
the master to decrypt this string. The master computes the decryption S of T
(S = D(T)) and sends it back.

Slave computation The slave replaces the k_out last bits of C_b by S to obtain
the original C_b (here k_in = k_out; if not, we simply adapt the lengths and the
halving of the intermediate ciphertext). The slave computes:

M_a = C_a ⊕ H(C_b)

M_b = C_b ⊕ H(M_a)

The slave finally recovers the initial random IV at the beginning of M_a and
recovers M from the two parts. It checks whether the random IV part is
correct (that is, whether the random IV is of the form U || h(M, U)), and if so returns
the message M; otherwise it outputs “error”.
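The two protocols can be exercised end to end with the following sketch. Several parts are assumptions made for illustration only: `ToyMaster` is a hypothetical 128-bit keyed permutation (a 4-round Feistel network over HMAC-SHA256) standing in for the card's E and D, SHA-256 and SHAKE-256 stand in for h and H, lengths are counted in bytes, and the message length is required to be even so that no extra padding bit is needed.

```python
import hashlib
import hmac
import os

K_BYTES = 16  # k_in = k_out = 128 bits


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def h(m: bytes, u: bytes) -> bytes:
    """Stand-in for h(M, U) (assumption: SHA-256 truncated to 128 bits)."""
    return hashlib.sha256(u + m).digest()[:K_BYTES]


def H(x: bytes) -> bytes:
    """Stand-in for the length-preserving hash H (assumption: SHAKE-256)."""
    return hashlib.shake_256(x).digest(len(x))


class ToyMaster:
    """Hypothetical card: a toy 128-bit keyed permutation, illustrative only."""

    def __init__(self, key: bytes):
        self._key = key

    def _f(self, rnd: int, half: bytes) -> bytes:
        return hmac.new(self._key, bytes([rnd]) + half, hashlib.sha256).digest()[:8]

    def encrypt(self, s: bytes) -> bytes:
        left, right = s[:8], s[8:]
        for rnd in range(4):
            left, right = right, xor(left, self._f(rnd, right))
        return left + right

    def decrypt(self, t: bytes) -> bytes:
        left, right = t[:8], t[8:]
        for rnd in reversed(range(4)):
            left, right = xor(right, self._f(rnd, left)), left
        return left + right


def encrypt(m: bytes, master: ToyMaster) -> bytes:
    if len(m) % 2:
        raise ValueError("this sketch assumes an even message length")
    u = os.urandom(K_BYTES)
    padded = u + h(m, u) + m                 # R || M with R = U || h(M, U)
    half = len(padded) // 2
    m_a, m_b = padded[:half], padded[half:]
    c_b = xor(m_b, H(m_a))                   # C_b = M_b xor H(M_a)
    c_a = xor(m_a, H(c_b))                   # C_a = M_a xor H(C_b)
    t = master.encrypt(c_b[-K_BYTES:])       # the only call to the master
    return c_a + c_b[:-K_BYTES] + t


def decrypt(c: bytes, master: ToyMaster) -> bytes:
    half = len(c) // 2
    c_a, c_b = c[:half], c[half:]
    s = master.decrypt(c_b[-K_BYTES:])       # the only call to the master
    c_b = c_b[:-K_BYTES] + s
    m_a = xor(c_a, H(c_b))                   # M_a = C_a xor H(C_b)
    m_b = xor(c_b, H(m_a))                   # M_b = C_b xor H(M_a)
    padded = m_a + m_b
    u, tag, m = padded[:K_BYTES], padded[K_BYTES:2 * K_BYTES], padded[2 * K_BYTES:]
    if not hmac.compare_digest(tag, h(m, u)):
        raise ValueError("error")            # the redundancy U || h(M, U) is wrong
    return m
```

Whatever the message length, the master handles exactly one 128-bit block per protocol run; all remaining work is hashing on the slave.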
6 Security

We will next sketch the proof showing that our scheme is secure against an 
adaptive chosen-message chosen-ciphertext attack. This will validate the design 
as secure as long as the scrambling function is ideal (i.e., polynomially indis- 
tinguishable from a random oracle or intuitively not easily distinguishable from 
a large random table) and the master’s encryption is not easily distinguishable 
from an ideal encryption. (Based on the ideal hash and encryption we can calculate
the attack success by calculating the probability of certain detection events
performed by the attacker, such as the detection of a valid ciphertext in the
challenge.)

We will analyze both attacks (challenges): valid pair creation and indistin- 
guishability. 

Let us first analyze the master probing phase (which is performed in both 
attacks). The attacker has taken control over the slave and can query the mas- 
ter without any filtering, both for encryption and decryption. Hence, at the 
end of the master probing phase, he has potentially gathered a list of p_1
plaintext/ciphertext pairs for the master’s inner encryption.

6.1 Against the valid pair creation attack 

Informally, the core of the proof is to show that it will be possible to get in- 
formation on only one valid plaintext/ciphertext pair for the global encryption 
out of a valid plaintext/ciphertext pair for the master encryption (except with
negligible probability).

With this type of challenge, there is no additional slave probing phase, as we 
can assume that everything potentially done during the slave probing phase can 
be done during the master probing phase. 

Assume that the attacker A, after performing p_1 queries to the master, is
able to produce p_1 + 1 valid plaintext/ciphertext pairs for the whole encryption.

Call S the part to be given as input to the master (that is, with the notations
of section 5, the k_in last bits of H(R || M_1) ⊕ M_2). Then, we have two cases: either
S is different for all p_1 + 1 pairs, or it is the same for at least two of them.

In the first case, A can be straightforwardly adapted to build an attacker
which breaks the master encryption, as it exhibits p_1 + 1 valid encryption/decryption
pairs for the master after only p_1 queries. We assumed the master’s encryption
can only be inverted with probability 2^{-(k_in)^δ} for some small constant δ; hence, in
the first case, the attacker can fulfill the challenge only with probability 2^{-(k_in)^δ}
(in the ideal cipher case this probability is easily derived from the number of
encryption probes).

In the second case, if two of the master’s inputs are the same, it means that A
was able to find two distinct pairs (U, M) and (U', M') such that the two values

H(U || h(M, U) || M_1) ⊕ M_2 and H(U' || h(M', U') || M'_1) ⊕ M'_2 match on their k_in
last bits.
Let us analyze the probability that such a collision occurs. The above can be
rewritten (on the k_in last bits):

H(U || h(M, U) || M_1) ⊕ H(U' || h(M', U') || M'_1) = M_2 ⊕ M'_2

In order to be able to find such a relation, the hash function H has to be
evaluated on its inputs (because of the random oracle model). This means that
M_2 and M'_2 have to be fixed prior to the evaluation of the left part. So, M_2 and
M'_2 being fixed, A was in fact able to find two couples (U, M_1) and (U', M'_1)
such that H(U || h(M, U) || M_1) ⊕ H(U' || h(M', U') || M'_1) matches a given constant
on its k_in last bits.

Now, we consider two sub-cases, whether h(M, U) and h(M', U') are equal
or not. If they are not equal, the attack performed is simply a birthday attack
on a part of the hash function H, whose probability of success (based on the
assumption on H) is 2^{-k_in/2}. If they are equal (which can only happen if M_2 and
M'_2 match on their k_in last bits), then A was able to find a collision on the hash
function (which occurs only with probability 2^{-(k_in)^γ} for some constant γ, given
the access limitation of the adversary).
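To give a feeling for the birthday attack mentioned above, the following sketch searches for two inputs whose hashes agree on their last bits. It is an illustration only: SHA-256 stands in for the ideal hash, the inputs are simple counters, and the number of matched bits is kept tiny so that the search finishes instantly. With b matched bits, roughly 2^{b/2} hash evaluations are expected before a collision shows up, which is why the k_in/2 exponent appears in the analysis.

```python
import hashlib


def last_bits(data: bytes, bits: int) -> int:
    """Return the last `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest & ((1 << bits) - 1)


def birthday_collision(bits: int):
    """Find two distinct inputs whose hashes agree on their last `bits` bits.

    Returns (first_input, second_input, number_of_hash_evaluations).
    """
    seen = {}
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        tail = last_bits(msg, bits)
        if tail in seen:
            return seen[tail], msg, counter + 1
        seen[tail] = msg
        counter += 1
```

Calling `birthday_collision(16)` terminates after at most 2^16 + 1 evaluations by the pigeonhole principle, and typically after a few hundred.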

Finally (under concrete limited access of the adversary to the hash and
encryption boxes, formalized as a bound on the success probability), A will be able
to answer the challenge with probability bounded by:

2^{-(k_in)^δ} + 2^{-k_in/2} + 2^{-(k_in)^γ}.

This is a negligible and controllable concrete probability of successful attack. 

6.2 Against the distinguishability attack 

Note that, for the distinguishability attack, the attacker can always guess the 
challenge and due to the challenge being random it has probability 1/2 of being 
correct. The only other way for the attacker is to probe and perform encryptions 
and decryptions and match them against the challenge. 

In this type of attack, A first performs p_1 master probes. Then, in the second
phase, the attacker has lost control of the slave. He is presented a challenge pair 
(of which he can choose the plaintext which we call the target plaintext.) His goal 
is to guess with probability substantially larger than 1/2 whether the challenge 
pair represents a valid encryption pair (i.e. whether the ciphertext is a correct 
encryption of the plaintext). 

We first consider the security without the third slave probing phase. We will 
show afterwards that this phase does not help the attacker. 

Again, we have two cases: either the kout last bits of the ciphertext match a 
ciphertext from the master ciphertext list or not. In the first case (which happens
with probability p_1 2^{-k_out}), A will be able to decrypt and to answer the challenge.

In the second case, in order to be able to answer the challenge correctly,
he has to be able to find out whether the following relations hold:

C_b ⊕ M_b = H(M_a)

C_a ⊕ M_a = H(C_b)

However, he has no information about the k_in last bits of C_b (except with
probability 2^{-(k_in)^δ}, if he successfully inverts the master’s encryption), and he
has no information on the k_in first bits of M_a (because the slave was reset after
the intrusion, and thus he cannot predict the value of the random number chosen
by the slave).

So, in order to be able to check with some advantage if the above relations 
hold, the attacker has to guess the value of H on Ma (to validate the first eq. 
based on partial availability of Cb) or the value of H on Cb (to validate via the 
second eq.). This can be done with probability 2^{-k_in} for each (since only if the
right string is guessed it is validated via the random oracle hash). 

Hence, with only the initial master probing phase, the probability that A can 
correctly answer the challenge is 1/2 + 2^{-(k_in)^δ} + p_1 2^{-k_out} + 2^{-k_in+1}.

Now, let us analyze what the attacker can do after the last slave probing 
phase. 

Recall that A is allowed to perform p_2 slave probes. He is allowed to query the
encryption of any message, and the decryption of any message but the ciphertext 
challenge. 

Of course, we assume that the ciphertext challenge has not already been 
decrypted, that is the kout last bits of the challenge do not appear in any of the 
p_1 pairs previously obtained.

If he queries for encryption on a different plaintext than the challenge plain- 
text, he will not get any information (at best, he will obtain a ciphertext which 
he can decrypt (with the help of the p_1 pairs previously obtained), and will be
able to retrieve the random IV of the slave). 

If he queries for encryption on the target plaintext, then he will obtain information
on the challenge only if he actually obtains the target ciphertext, which
happens if the random IV (which is totally under the control of the slave now)
is the same as in the challenge, that is, with probability 2^{-k_in}.

If he queries for decryption, the best he can try is to query for a decryption 
of a ciphertext which matches the challenge on its kout last bits. In this case, he 
will be able to find the decryption of the master’s encryption on these bits if he 
can guess the random IV of the slave, that is, again, with probability 2^{-k_in}.

Finally, the probability that he can answer the challenge is: 

1/2 + 2^{-(k_in)^δ} + p_1 2^{-k_out} + 2^{-k_in+1} + p_2 2^{-k_in}

This is, again, a controllable probability (due to the actual limitations of the
adversary) that can be made as close to 1/2 as we want, even as we allow probing, by
proper choice of k_in and k_out.
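As a worked example, the terms of this bound can be evaluated exactly for concrete parameters. The sketch below is ours and rests on assumptions: the probability of inverting the master's encryption is left as an input (set to 0 by default, as if the cipher were never inverted), and the remaining terms are the ones named in the analysis, namely p_1 2^{-k_out} for a match with the probed list, 2 · 2^{-k_in} for the two guesses of H values, and p_2 2^{-k_in} for the post-challenge probes.

```python
from fractions import Fraction


def distinguishing_bound(k_in, k_out, p1, p2, inversion_prob=Fraction(0)):
    """Sum of the success-probability terms from the distinguishability analysis."""
    return (Fraction(1, 2)
            + inversion_prob                  # master's encryption inverted
            + p1 * Fraction(1, 2 ** k_out)    # challenge matches a probed ciphertext
            + Fraction(2, 2 ** k_in)          # guessing H(M_a) or H(C_b)
            + p2 * Fraction(1, 2 ** k_in))    # lucky post-challenge probe


# Example: 128-bit parameters and 2^40 probes in each phase.
excess = distinguishing_bound(128, 128, 2 ** 40, 2 ** 40) - Fraction(1, 2)
```

For these parameters the excess over 1/2 is (2^41 + 2) / 2^128, that is, below 2^-86.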




7 Experimental results 
We now briefly comment on the efficiency of our scheme within the suggested
context of hardware-assisted encryption. We implemented it on a personal
computer (a PentiumPro(tm) 200). We chose our security parameters (k_in and k_out)
to be 128 bits long. Times for the scrambling part are summarized in the table
below.



length (bits)   1024   2048   4096   8192   16384   32768   65536
time (ms)        0.6    1.0    1.8    3.6     7.1    14.2    28.3



These speeds are not at first very impressive, but they have to be compared 
to a case where the master device (typically a smartcard) performs all the com- 
putation alone. Taking the DES as an example, implementations on smartcards 
theoretically range from 5 ms to 20 ms per block, but are often in the vicinity of 
40 ms in order to prevent differential fault analysis [Nac]. Taking 40 ms for two 
DES blocks as an optimistic average value, we summarize the encryption speed 
of a smartcard based DES for messages of the same size as above. 



length (bits)   1024   2048   4096   8192   65536
time (ms)        320    640   1280   2560   20480



Assuming that the encryption is pipelined, and choosing to encrypt blocks of 98304 =
3·2^15 bits (which makes the master and the slave computation workload
about equal), we can encrypt with our scheme at a rate of about 2460 kbits/s,
which has to be compared to the 3.2 kbits/s rate of the smartcard alone. We
comment that in the context of public-key encryption, assuming that a smartcard
can perform a 1024-bit RSA encryption in about one second, our design allows
the encryption in one second of a 300 kilobyte message with the smartcard’s
public key.
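The figures above can be rechecked with a few lines of arithmetic. The sketch uses only numbers stated in this section (20 ms per DES block on the card, 128-bit master input, 28.3 ms of scrambling for 65536 bits); the assumptions are ours: the slave time is extrapolated linearly, and the pipeline period is taken to be the master's 40 ms, which is how the quoted rate appears to have been obtained.

```python
DES_BLOCK_BITS = 64
CARD_MS_PER_DES_BLOCK = 20.0   # "40 ms for two DES blocks"
SCRAMBLE_MS_65536 = 28.3       # measured scrambling time for 65536 bits


def card_only_ms(bits: int) -> float:
    """Time for the smartcard to DES-encrypt `bits` bits on its own."""
    return bits / DES_BLOCK_BITS * CARD_MS_PER_DES_BLOCK


def card_only_rate_kbits() -> float:
    """Smartcard-only rate; bits per millisecond equals kbits per second."""
    return DES_BLOCK_BITS / CARD_MS_PER_DES_BLOCK


def master_ms(k_in_bits: int = 128) -> float:
    """Time the card spends on the single k_in-bit input of one message."""
    return k_in_bits / DES_BLOCK_BITS * CARD_MS_PER_DES_BLOCK


def slave_ms(block_bits: int = 3 * 2 ** 15) -> float:
    """Scrambling time, extrapolated linearly from the 65536-bit measurement."""
    return SCRAMBLE_MS_65536 * block_bits / 65536


def pipelined_rate_kbits(block_bits: int = 3 * 2 ** 15) -> float:
    """Rate when one block is completed per master period (an assumption)."""
    return block_bits / master_ms()
```

With these inputs the master needs 40 ms per message and the extrapolated slave time is about 42 ms, so the two workloads are indeed about equal; dividing 98304 bits by 40 ms gives roughly 2458 kbits/s, against 3.2 kbits/s for the card alone.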



8 Conclusion 

We have presented the notion of “scramble all, encrypt small.” We showed how it 
fits within the “remotely keyed encryption.” The principle demonstrated by the 
notion can be employed elsewhere as a sub-structure, for example if the scram- 
bling is not “ideal” we may iterate the basic overall encryption many times 
(adding randomness only at the first iteration, or avoiding adding randomness). 
Also, we may first perform more “Feistel scrambling” rounds. These types of vari- 
ations and extensions and their impact on security (and its proof) under differ- 
ent assumptions about the scrambling component are interesting open questions. 
Variations on attacks, challenges and notions of security are also of interest. 
Abstract. Remotely keyed encryption schemes (RKESs) support fast
encryption and decryption using low-bandwidth devices, such as secure
smartcards. The long-lived secret keys never leave the smartcard, but 
most of the encryption is done on a fast untrusted device, such as the 
smartcard’s host. 

This paper describes a new scheme, the length-preserving “accelerated
remotely keyed” (ARK) encryption scheme and, in a formal model, pro- 
vides a proof of security. For the sake of practical usability, our model 
avoids asymptotics. 

Blaze, Feigenbaum, and Naor gave a general definition for secure RKESs
0. Compared to their length-preserving scheme, the ARK scheme is 
more efficient but satisfies the same security requirements.



1 Introduction 

A remotely keyed encryption scheme (RKES) distributes the computational bur- 
den for a block cipher with large blocks between two parties, a host and a card. 
We think of the host being a computer under the risk of being taken over by 
an adversary, while the card can be a (hopefully tamper-resistant) smartcard, 
used to protect the secret key. We do not consider attacks to break the
tamper-resistance of the smartcard itself. The host knows plaintext and ciphertext, but
only the card is trusted with the key. 

An RKES consists of two protocols, one for encryption and one for decryption. 
Given a β-bit input, either to encrypt or to decrypt, such a protocol runs like
this: The host sends a challenge value to the card, depending on the input, and
the card replies with a response value, depending on both the challenge value and
the key. This exchange of values can be iterated. During one run of a protocol, 
every challenge value may depend on the input and the previously given response 
values, and the response values may depend on the key and the previous challenge 
values. (In this paper, we disregard probabilistic RKESs, where challenge and/or 
response values also may depend on random coin flips.) 
1.1 History

The notion of remotely keyed encryption is due to Blaze (SI- Lucks pointed 
out some weaknesses of Blaze’s scheme and gave formal requirements for the 
security of RKESs: 

(i) Forgery security: If the adversary has controlled the host for q - 1 interactions,
she cannot produce q plaintext/ciphertext pairs. 

(ii) Inversion security: An adversary with (legitimate) access to encryption must 
not be able to decrypt and vice versa.

(iii) Pseudorandomness: The encryption function should behave randomly, for 
someone neither having access to the card, nor knowing the secret key. 

While Requirements (i) and (ii) restrict the abilities of an adversary with access
to the smartcard, Requirement (iii) is only valid for outsider adversaries, having
no access to the card. If an adversary could compute forgeries or run inversion 
attacks, she could easily distinguish the encryption function from a random one. 

1.2 Pseudorandomness — Towards a Better Definition 

It is theoretically desirable that a cryptographic primitive always appears to 
behave randomly for everyone without access to the key. So why not require 
pseudorandomness with respect to insider adversaries? 

In any RKES, the amount of communication between the host and the card 
should be smaller than the input length, otherwise the card could just do the 
complete encryption on its own. Since (at least) a part of the input is not handled 
by the smartcard, and, for the same reasons, (at least) a part of the output is 
generated by the host, an insider adversary can easily decide that the output 
generated by herself is not random. 

Recently, Blaze, Feigenbaum, and Naor j2j found a better formalism to define 
the pseudorandomness of RKESs. Their idea is based on the adversary gaining 
direct access to the card for a certain amount of time, making q_h interactions with
the card. For the adversary having lost direct access to the card, the encryption 
function should behave randomly. An attack is divided into two phases: 

1. During the host phase (h-phase), the adversary is an insider, sends challenge 
values to the card and learns the card’s response values. She may run through 
the en- and the decryption protocol and may also deviate from the protocol 
(note though, that the card always interprets the “next value” it reads as 
the next challenge value, until the current protocol is finished). 

At the end of the h-phase, the adversary loses direct access to the card, i.e., 
is no longer an insider. 

2. In the distinguishing phase (d-phase), the adversary chooses texts and asks 
for their en- or decryptions. The answers to these queries are either chosen 
randomly, or by honestly en- or decrypting according to the RKES. 

The adversary’s task is to distinguish between the random case and honest 
encryption. 
Consider an adversary having encrypted the plaintext P* and learned the
corresponding ciphertext C* during the h-phase. If she could ask for the encryption of
P* or the decryption of C* during the d-phase, her task would be quite easy. In 
the d-phase, we thus need to “filter” texts that appeared in the h-phase before. 
But since the adversary may deviate from the protocol, it is not easy to formally 
define which texts are to be filtered out. The authors of |5] require an arbiter 
algorithm B to sort out up to q_h texts. This algorithm need not actually be
implemented, it simply needs to exist. (The formal definition below looks quite 
complicated. Readers with little interest in formalisms should keep in mind that
the arbiter B treats the special case that in the d-phase the adversary A asks for 
values already known from the h-phase. As will become clear below, the arbiter 
B for our scheme does exist and actually is quite simple.) 

Throughout this paper, “random” always means “according to the uniform
probability distribution”. By x ⊕ y we denote the bit-wise XOR of x and y.

After that much discussion, we give the formal definitions (which are not
much different from the ones in [3]).



1.3 Definitions 

A (length-preserving) RKES is a pair of protocols, one for en- and one for decryp- 
tion, to be executed by a host and a card. The length of a ciphertext is the same 
as that of the corresponding plaintext. 

Let B be an algorithm, the “arbiter algorithm”, which is initialized with a 
transcript of the communication between host and card during the h-phase. 

During the host phase (h-phase), A may play the role of the host and exe- 
cute both the card’s protocols up to qh times, together. A may send challenge 
values to the card not generated according to the protocol and does learn the 
corresponding response values. 

During the distinguishing phase (d-phase), A chooses up to qd texts T as 
queries and asks for the corresponding en- or decryptions. 

W.l.o.g., we prohibit A from asking equivalent queries, i.e., asking twice for the
encryption of T, asking twice for the decryption of T, or asking once for the
encryption of a T and some time before or after this for the decryption of the
corresponding ciphertext. (Encrypting under a length-preserving RKES is a
permutation, hence A doesn’t learn anything new from asking equivalent queries.)

Before the d-phase starts, a switch S is randomly set either to 0 or to 1. If
the arbiter B acts, A’s query is answered according to the RKES; B can act on
at most q_h queries.

Consider the queries B does not act on. The answers are generated depending
on S. Consider A asking for the en- or decryption of a text T ∈ {0,1}^β with
β > a. If S = 0, the response is evaluated according to the RKES. If S = 1, the
response is a random value in {0,1}^β.

At the end of the d-phase, A’s task is to guess S. A’s advantage adv_A is

$$\mathrm{adv}_A = \bigl|\, \mathrm{prob}[\text{“A outputs 1”} \mid S = 1] - \mathrm{prob}[\text{“A outputs 1”} \mid S = 0] \,\bigr| .$$

By q_h, we denote the number of interactions between the adversary A and the
card during the h-phase, by q_d we denote the number of queries A asks during
the d-phase; q := q_h + q_d denotes the total query number.

A RKES is (t, q, ε)-secure, if there exists an arbiter algorithm B such that
any t-time adversary A with a total query number of at most q has an advantage
of at most ε.
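
The d-phase of the definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DPhaseOracle`, the callables `encrypt`, `decrypt` and `arbiter_acts`, and the helper `advantage` are hypothetical names introduced here, and texts are modelled as byte strings rather than bit strings.

```python
import secrets


class DPhaseOracle:
    """Answers d-phase queries for a length-preserving scheme (illustrative only)."""

    def __init__(self, encrypt, decrypt, arbiter_acts, switch=None):
        self.encrypt = encrypt            # honest encryption: bytes -> bytes
        self.decrypt = decrypt            # honest decryption: bytes -> bytes
        self.arbiter_acts = arbiter_acts  # (text, decrypting) -> bool, stands in for B
        # The switch S is set randomly before the d-phase starts.
        self.switch = secrets.randbelow(2) if switch is None else switch

    def query(self, text, decrypting=False):
        honest = self.decrypt if decrypting else self.encrypt
        # If B acts, or if S = 0, the query is answered according to the scheme.
        if self.arbiter_acts(text, decrypting) or self.switch == 0:
            return honest(text)
        # Otherwise (S = 1) the answer is a uniformly random text of the same length.
        return secrets.token_bytes(len(text))


def advantage(p_one_given_s1, p_one_given_s0):
    """adv_A = | prob[A outputs 1 | S = 1] - prob[A outputs 1 | S = 0] |."""
    return abs(p_one_given_s1 - p_one_given_s0)
```

The sketch does not enforce the ban on equivalent queries or the bound q_d on the number of queries; a full experiment would have to track both.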

1.4 Building Blocks and Security Assumptions 

In this section, we describe the building blocks we use for our scheme. As will be 
proven below, our scheme is secure if its building blocks are secure. Note that 
definitions of standard cryptographic and complexity theoretic terms are left out 
here; they can be found e.g. in j^. 

By a and b with b ≥ a, we denote the blocksizes of our building blocks (while
our scheme itself is able to encrypt blocks which may grow arbitrarily large).
Note that a and b are important security parameters! We may use, say,
a 64-bit block cipher such as triple DES as pseudorandom permutation, but this
has significant consequences for the security of our scheme, even if the adversary
cannot break triple DES.

Our building blocks are

- an a-bit blockcipher E (i.e., a family of pseudorandom permutations E_K
over {0,1}^a),

- a family of pseudorandom functions F_K : {0,1}^b → {0,1}^a (F may be a
b-bit blockcipher; if a < b we ignore the last b − a bits of the output),

- a hash function H : {0,1}* → {0,1}^b, and

- a length-preserving stream cipher S : {0,1}* → {0,1}*, depending on
an a-bit key. In practice, S may be an additive stream cipher, i.e., a
pseudorandom bit generator where each bit of the output is XOR-ed
with the plaintext (or ciphertext, if we think of S^{-1}). Just as well, S
may be designed from the block cipher E, using a standard chaining
mode such as CBC.

For the analysis, we assume our building blocks (such as block ciphers) to
behave like their ideal counterparts (such as random permutations). This “ideal
world” view allows us to define the resistance of our scheme against adversaries
with unbounded running time:

A RKES is (q, ε)-secure, if any adversary A with a query-complexity of
at most q has an advantage of at most ε.

Consider our RKES being (q, ε)-secure in the ideal world, but not (t, q, ε+ε′)-secure
in the real world. If either t is large enough to be infeasible or ε′ > 0 is
small enough to be negligible, the notion “(q, ε)-secure” can still approximately
describe the scheme’s true security. Otherwise, we have found an attack on (at
least) one of the underlying building blocks. Being (q, ε)-secure for reasonable
values of q and ε implies that the construction itself is sound.

This is a standard argument for many cryptographic schemes, being composed
from other cryptographic schemes and “provably secure”.

Our security assumptions are

1. E_K is a random permutation over {0,1}^a, and for K ≠ K′ the permutations
E_K and E_K′ are independent.

2. F_K : {0,1}^b → {0,1}^a is a random function, i.e., a table of 2^b random
values in {0,1}^a. Similarly to above, two random functions depending
on independently chosen keys are assumed to be independent.

3. H is collision resistant, i.e., the adversary does not know and is unable to
find a pair (V, V′) of strings in {0,1}* with V ≠ V′ and H(V) = H(V′).

4. S_K is a length-preserving stream cipher, depending on a key K ∈ {0,1}^a.
I.e., for every number n, every plaintext T ∈ {0,1}^n, every set of keys
L = {K_1, …, K_r} ⊆ {0,1}^a and every key K ∈ {0,1}^a, K ∉ L, the
value S_K(T) ∈ {0,1}^n is a random value, independent of T, S_{K_1}(T), …,
S_{K_r}(T). Similarly, the value S_K^{-1}(T) is a random value, independent of
S_{K_1}^{-1}(T), …, S_{K_r}^{-1}(T).

We do not specify the key sizes of E and F. We implicitly assume the security
level of E and F (and thus their key size) to be long enough that breaking either
of them is infeasible.

In the world of complexity theoretical cryptography, the usage of asympto- 
tics is quite common. While this may simplify the analysis, it often makes the 
results less useful in practice. From a proof of security, the implementor of a 
cryptographic scheme may conclude the scheme to be secure if the security pa- 
rameters are chosen large enough - but such a result provides little help to find 
out how large is “large enough”. (Often, the implementor can find this out by 
very diligently reading and understanding the proof, though.) 

This paper avoids asymptotics. If we call an amount of time to be “infeasible” , 
we are talking about a fixed amount of computational time. What actually is 
considered infeasible depends on the implementors/users of the scheme and their 
threat model. Similarly, we use the word “negligible”. 

2 The ARK Encryption Scheme 

Using the above building blocks, we describe the accelerated remotely keyed
(ARK) encryption scheme. For the description, we use two random permutations
E_1, E_2 over {0,1}^a and two random functions F_1, F_2 : {0,1}^b → {0,1}^a.
In practice, these components are realized pseudorandomly, depending on four
different keys.

The encryption function takes any β-bit plaintext, encrypts it, and outputs
a β-bit ciphertext. The blocksize β can take any value β > a.

We represent the plaintext by (P, Q) with P ∈ {0,1}^a and Q ∈ {0,1}^{β−a};
similarly we represent the ciphertext by (C, D) with C ∈ {0,1}^a and D ∈
{0,1}^{β−a}. For the protocol description, we also consider intermediate values
X, Z ∈ {0,1}^b and Y ∈ {0,1}^a. The encryption protocol works as follows:

Fig. 1. The ARK encryption protocol.



1. Given the plaintext (P, Q), the host sends P and X := H(Q) to the card.

2. The card responds with Y := E_1(P) ⊕ F_1(X).

3. The host computes D := S_Y(Q).

4. The host sends Z := H(D) to the card.

5. The card responds with C := E_2(Y ⊕ F_2(Z)).

Decrypting (C, D) is done like this:

1. The host sends C and Z = H(D) to the card.

2. The card responds with Y = E_2^{-1}(C) ⊕ F_2(Z).

3. The host computes Q = S_Y^{-1}(D).

4. The host sends X = H(Q) to the card.

5. The card responds with P = E_1^{-1}(Y ⊕ F_1(X)).

Note that by first encrypting any plaintext (P, Q) under any key and then
decrypting the resulting ciphertext under the same key, one gets (P, Q) again.
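
The two protocols can be traced in a short Python sketch. Everything concrete in it is an assumption made for illustration, not the paper's instantiation: F_i is replaced by truncated HMAC-SHA256, E_i by a toy 4-round Feistel network built from that PRF, H by SHA-1 (b = 160), and S_Y by an additive keystream derived from SHA-256 in counter mode. The names `Card`, `encrypt` and `decrypt` are hypothetical, and the code is not meant as a secure implementation.

```python
import hashlib
import hmac

A_BYTES = 16  # a = 128 bits


def xor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))


def prf(key, data, n=A_BYTES):
    """Stand-in for F_i: HMAC-SHA256 truncated to n bytes."""
    return hmac.new(key, data, hashlib.sha256).digest()[:n]


def perm(key, block, inverse=False):
    """Stand-in for E_i: a toy 4-round Feistel network on 16-byte blocks."""
    half = A_BYTES // 2
    left, right = block[:half], block[half:]
    if not inverse:
        for r in range(4):
            left, right = right, xor(left, prf(key, bytes([r]) + right, half))
    else:
        for r in reversed(range(4)):
            left, right = xor(right, prf(key, bytes([r]) + left, half)), left
    return left + right


def H(data):
    """Stand-in for the hash H with b = 160 bits."""
    return hashlib.sha1(data).digest()


def stream(key, text):
    """Stand-in for S_Y: additive keystream, hence its own inverse."""
    out, counter = bytearray(), 0
    while len(out) < len(text):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return xor(text, bytes(out))


class Card:
    def __init__(self, ke1, ke2, kf1, kf2):
        self.ke1, self.ke2, self.kf1, self.kf2 = ke1, ke2, kf1, kf2
        self._y = None  # the card keeps Y until the current protocol run ends

    def enc_first(self, p, x):   # encryption step 2: Y := E_1(P) xor F_1(X)
        self._y = xor(perm(self.ke1, p), prf(self.kf1, x))
        return self._y

    def enc_second(self, z):     # encryption step 5: C := E_2(Y xor F_2(Z))
        return perm(self.ke2, xor(self._y, prf(self.kf2, z)))

    def dec_first(self, c, z):   # decryption step 2: Y = E_2^{-1}(C) xor F_2(Z)
        self._y = xor(perm(self.ke2, c, inverse=True), prf(self.kf2, z))
        return self._y

    def dec_second(self, x):     # decryption step 5: P = E_1^{-1}(Y xor F_1(X))
        return perm(self.ke1, xor(self._y, prf(self.kf1, x)), inverse=True)


def encrypt(card, text):
    p, q = text[:A_BYTES], text[A_BYTES:]
    y = card.enc_first(p, H(q))
    d = stream(y, q)
    return card.enc_second(H(d)) + d


def decrypt(card, text):
    c, d = text[:A_BYTES], text[A_BYTES:]
    y = card.dec_first(c, H(d))
    q = stream(y, d)
    return card.dec_second(H(q)) + q
```

Note how the card only ever handles a-bit and b-bit values, while the host does all work proportional to the length of Q.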

3 The Security of the ARK Scheme 

By P_i, X_i, Y_i, Z_i, and C_i we denote the challenge and response values of
the i-th protocol execution, which may be either an en- or a decryption protocol.
The protocol can either be executed in the h-phase, indicated by i ∈ {1, …, q_h},
or in the d-phase, indicated by i ∈ {q_h + 1, …, q}. A value Y_i is “unique”, if
Y_i ∉ {Y_1, …, Y_{i−1}, Y_{i+1}, …, Y_q}. The ARK scheme’s security greatly depends on
the values Y_k in the d-phase being unique, except when B acts (“in the d-phase”
indicates k > q_h). If the k-th en- or decryption query is answered according to
the ARK scheme, and if Y_k is unique, then the answer is almost a random value.

Theorem 1. For every number q and ε = 1.5 · q^2/2^a, the ARK scheme is a
(q, ε)-secure length-preserving RKES.

Proof. For the proof, we first define the arbiter algorithm, and then we bound 
the advantage of the adversary. 

The arbiter algorithm B: 

For i ∈ {1, …, q_h}, the arbiter B compiles a list L_1 of all the pairs (P_i, X_i) and
another list L_2 of all pairs (C_i, Z_i). These can be deduced from the transcript.

If, in the d-phase, A asks for the encryption of a plaintext (P_j, Q_j) with
j ∈ {q_h + 1, …, q}, the first challenge value for the card is the pair (P_j, X_j) with
X_j = H(Q_j). Only if the pair (P_j, X_j) is contained in the list L_1, B acts on
that query, and the answer is generated according to the encryption protocol.
Similarly, if A asks for the decryption of a ciphertext (C_j, D_j) in the d-phase,
and if the corresponding challenge value (C_j, Z_j) is contained in L_2, B acts and
the answer is generated according to the decryption protocol.
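
A minimal Python sketch of such an arbiter follows. It assumes that the h-phase transcript is available as tuples (P_i, X_i, Z_i, C_i), that L_2 holds the pairs (C_i, Z_i) used in the membership test, and that H is SHA-1; the class name `Arbiter` and its method `acts` are hypothetical.

```python
import hashlib


def H(data):
    """Stand-in for the collision resistant hash H."""
    return hashlib.sha1(data).digest()


class Arbiter:
    def __init__(self, transcript, a_bytes=16):
        # transcript: iterable of (P_i, X_i, Z_i, C_i) from the h-phase
        transcript = list(transcript)
        self.a_bytes = a_bytes
        self.l1 = {(p, x) for (p, x, z, c) in transcript}  # list L_1
        self.l2 = {(c, z) for (p, x, z, c) in transcript}  # list L_2

    def acts(self, text, decrypting=False):
        """True if the first challenge of this d-phase query appeared in the h-phase."""
        head, tail = text[:self.a_bytes], text[self.a_bytes:]
        challenge = (head, H(tail))  # (P_j, X_j) or (C_j, Z_j)
        return challenge in (self.l2 if decrypting else self.l1)
```

The arbiter never needs a key: it only compares the first challenge value of a query with the recorded transcript.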

Now we argue that B does not act on more than q_h queries. Due to Assumption 3,
i.e., due to the collision resistance of H, the adversary A does not know
more than one value Q with H(Q) = X_i. For the same reason, A does not know
more than one value D with H(D) = Z_i. Hence for every i ∈ {1, …, q_h}, A can
ask no more than one plaintext (P_j, Q_j) to be encrypted during the d-phase,
where the corresponding pair (P_j, X_j) would be found in the list L_1. Similarly,
we argue for decryption queries. Finally, consider a plaintext T = (P_j, Q_j) and
the corresponding ciphertext T′ = (C_j, D_j). Asking for the encryption of T is
equivalent to asking for the decryption of T′, and we only regard non-equivalent
queries. We observe: ((P_j, H(Q_j)) is in the list L_1) ⇔ ((C_j, H(D_j)) is in L_2).

The advantage of A:

In short, the remainder of the proof is as follows:

Both for S = 0 and S = 1, we define what it means for the k-th query
(k ∈ {q_h + 1, …, q}) to be “BAD_k”. We define sets U_k and show that if
query k is not BAD_k, the response is a uniformly distributed random
value in U_k. Further, we evaluate the probability that the k-th query is
BAD_k. We write BAD* if any query k in the d-phase is BAD_k. If not BAD*,
then all the answers to A are randomly chosen according to a probability
distribution induced by the sets U_k, and thus do not depend on S. This
allows us to bound the advantage of A:

$$\mathrm{adv}_A \le \mathrm{prob}[\mathrm{BAD}^* \mid S = 0] + \mathrm{prob}[\mathrm{BAD}^* \mid S = 1].$$

Let k > q_h and A be asking for the encryption of a plaintext (P_k, Q_k) ∈ {0,1}^a ×
{0,1}^{β−a}.

We assume (P_k, X_k) ∉ {(P_1, X_1), …, (P_{k−1}, X_{k−1})}. If j ∈ {1, …, q_h} and
(P_j, X_j) = (P_k, X_k), then B acts and the answer to this query does not depend
on S. If j ∈ {q_h + 1, …, k−1}, then (P_j, Q_j) ≠ (P_k, Q_k), because the j-th and the
k-th query are not equivalent. If (P_j, Q_j) ≠ (P_k, Q_k), then (P_j, X_j) = (P_k, X_k)
would indicate a collision Q_j ≠ Q_k for H, something we assume A can’t find.

Depending on previous protocol executions and responses, we define the set
U_k ⊆ {0,1}^a × {0,1}^{β−a} of ciphertexts:

$$U_k := \bigl(\{0,1\}^a - \{C_1, \ldots, C_{k-1}\}\bigr) \times \{0,1\}^{\beta-a}.$$

For S = 1, the ciphertext (C_k, D_k) is a uniformly distributed random value
in {0,1}^a × {0,1}^{β−a}. We define BAD_k :⇔ C_k ∈ {C_1, …, C_{k−1}}. Obviously, if
not BAD_k, then (C_k, D_k) is a uniformly distributed random value in U_k. Further:

$$\mathrm{prob}[\mathrm{BAD}_k \mid S = 1] \le \frac{k-1}{2^a}.$$

Now, we concentrate on S = 0. Here, we define

$$\mathrm{BAD}_k :\Leftrightarrow \bigl( Y_k \in \{Y_1, \ldots, Y_{k-1}\} \ \text{or}\ C_k \in \{C_1, \ldots, C_{k-1}\} \bigr).$$

Obviously, if not BAD_k, then (C_k, D_k) ∈ U_k. Also, if not BAD_k, then Y_k is not
in {Y_1, …, Y_{k−1}}, and then S_{Y_k}(Q_k) is a uniformly distributed random value in
{0,1}^{β−a}. Further, if Z_j = Z_k, then due to Y_j ≠ Y_k, we have C_j ≠ C_k. Apart
from this restriction, C_k is a uniformly distributed random value in {0,1}^a, and
especially: if not BAD_k, the ciphertext (C_k, D_k) is uniformly distributed in U_k.

If X_j ≠ X_k, then F_1(X_j) and F_1(X_k) are two independent random values
in {0,1}^a, and so are Y_j and Y_k. Since (P_k, X_k) ∉ {(P_1, X_1), …, (P_{k−1}, X_{k−1})}, for
every j ∈ {1, …, k−1} we have P_j ≠ P_k if X_j = X_k. In this case Y_j ≠ Y_k.
Hence prob[Y_j = Y_k] ≤ 2^{−a}. Similarly, we get prob[C_j = C_k | Y_j ≠ Y_k] ≤ 2^{−a}.
This gives

$$\mathrm{prob}[\mathrm{BAD}_k \mid S = 0] \le 2 \cdot \frac{k-1}{2^a}.$$



Thus:

$$\mathrm{prob}[\mathrm{BAD}^* \mid S = 1] \le \sum_{k=q_h+1}^{q} \mathrm{prob}[\mathrm{BAD}_k \mid S = 1] \le \frac{1}{2} \cdot \frac{q^2}{2^a},$$

and

$$\mathrm{prob}[\mathrm{BAD}^* \mid S = 0] \le \sum_{k=q_h+1}^{q} \mathrm{prob}[\mathrm{BAD}_k \mid S = 0] \le \frac{q^2}{2^a}.$$

Due to the symmetric construction of the ARK encryption protocol, the same
argument applies if A, as the k-th query, asks for the decryption of the ciphertext
(C_k, D_k) instead of asking for an encryption. Hence, the advantage of A is

$$\mathrm{adv}_A \le \frac{3}{2} \cdot \frac{q^2}{2^a}.$$

□
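
To get a feeling for the bound of Theorem 1, the following Python sketch evaluates ε = 1.5 · q^2/2^a exactly and inverts it for a target advantage. The function names `ark_bound` and `max_queries` are hypothetical helpers introduced here, not part of the scheme.

```python
from fractions import Fraction
from math import isqrt


def ark_bound(q, a):
    """Theorem 1: advantage bound 1.5 * q^2 / 2^a for q queries and a-bit blocks."""
    return Fraction(3, 2) * q * q / 2**a


def max_queries(eps, a):
    """Largest total query number q with 1.5 * q^2 / 2^a <= eps."""
    return isqrt(int(Fraction(eps) * 2**a * 2 / 3))
```

For example, `ark_bound(2**24, 64)` equals 3/2^17 (roughly 2.3 · 10^-5), while the same number of queries with a = 128 gives 3/2^81; this illustrates why the blocksize a is an important security parameter.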

4 The BFN Scheme [3]

In [3], Blaze, Feigenbaum, and Naor describe a length-preserving RKES,
which we refer to in short as the “BFN scheme”. As the ARK scheme is claimed
to be accelerated, we need to compare the BFN scheme and the ARK scheme.
Similarly to the ARK scheme, we represent the plaintext by (P, Q) with P ∈
{0,1}^a and Q ∈ {0,1}^{β−a}, and the ciphertext by (C, D) with C ∈ {0,1}^a and
D ∈ {0,1}^{β−a}. Further, we consider X, Z ∈ {0,1}^b and I, J, Y ∈ {0,1}^a. (Note
that [3] only considers b = a.)



Fig. 2. The BFN encryption protocol [3].



As building blocks, we need one random permutation E_0 over {0,1}^a, three
random functions F_1, F_2 : {0,1}^b → {0,1}^a, F_3 : {0,1}^a → {0,1}^a, and a block
cipher E (i.e., a family of random permutations E_K over {0,1}^a, depending on
a key K ∈ {0,1}^a). The encryption protocol works as follows:

1. Given the plaintext (P, Q), the host sends P and X := H(Q) to the card.

2. The card computes X* = F_1(X), and uses X* as the key for E.

3. The card computes I := E_{X*}(P).

4. The card computes J := E_0(I).

5. The card responds Y := F_3(I) to the host.

6. The host computes D := S_Y(Q).

7. The host sends Z := H(D) to the card.

8. The card computes Z* = F_2(Z), and uses Z* as the key for E.

9. The card responds C := E_{Z*}(J) to the host.

Decrypting (C, D) is done the obvious way. 

Obviously, the ARK and the BFN scheme work quite similarly: 

1. The cryptographic calculations on the side of the host are exactly the same 
for both protocols. 

2. The communication between the host and the card is exactly the same for 
both protocols. 

3. Inside the card, the ARK scheme needs four evaluations of cryptographic
functions such as E_i and F_j. (Also, it needs two bit-wise XOR operations.)
In contrast to this, the BFN scheme needs six evaluations of cryptographic
functions.

4. Also inside the card, the ARK scheme allows the keys for the cryptogra- 
phic functions to be chosen once and for all. On the other hand, the BFN 
scheme requires two evaluations of the block cipher E, where the keys are 
dynamically chosen, depending on some of the challenge values. 

The third point indicates that, when implementing both schemes using the 
same building blocks, the BFN scheme’s smartcard operations should take 50 % 
more time than the ARK scheme’s smartcard operations. Due to the last point, 
things actually can be much worse for the BFN scheme. This greatly depends 
on the choice of the block cipher E and the key set-up time of E. 

5 Final Remarks 

5.1 The Importance of the Building Blocks 

It should be stressed that our proof of security is with respect to any attack an 
adversary can come up with. Often, proofs of security in cryptography only deal 
with specific attacks, such as differential or linear cryptanalysis. 

On the other hand, for practically using the ARK scheme, one has to in- 
stantiate the generic building blocks we are using. Our proof of security is only 
applicable, if all building blocks are secure, i.e., satisfy the security assumptions 
specified in this paper. 

Hence, ARK scheme implementors have the freedom of choice to select their 
building blocks - and take the responsibility for the building blocks to be secure. 
If the building blocks are secure, the scheme is secure, too. 



5.2 Inversion Security 

This paper’s reasoning is based on the security definition of Blaze, Feigenbaum
and Naor [3], which seems to be more suitable than the one of Lucks [8]. It
requires RKESs to be pseudorandom with respect to ex-insiders. Consider the
RKES Σ being pseudorandom with respect to ex-insiders.
Clearly, Σ is pseudorandom with respect to outsiders, too. Also, Σ is forgery
secure. Otherwise, the adversary could execute the protocol q − 1 times in the
h-phase to predict q plaintext/ciphertext pairs. In the d-phase, the adversary could
ask for the encryptions of these q plaintexts, and compare the results with her
own predictions. (Note that the arbiter B can act on at most q − 1 plaintexts.)

But what about inversion security? Obviously, this property is quite desirable
for some applications. Consider the RKES Σ*, a simple modification of Σ. Both
the en- and the decryption protocol of Σ* start with an additional challenge
value β ∈ {0,1}. The encryption protocol of Σ* goes like this:

1. The host sends β to the card.

2. If β = 0, both parties follow the encryption protocol of Σ.
Else, both parties follow the decryption protocol of Σ.

Similarly, we may define the decryption protocol of Σ*. Note that we did change
the protocol, but the en- and decryption functions remain the same. Since the
additional challenge value just allows the adversary another way to execute the
decryption protocol, the security of Σ* and Σ is the same:

Theorem 2. Σ is (q, ε)-secure ⇔ Σ* is (q, ε)-secure.

On the other hand, Σ* clearly is inversion insecure. If, e.g., we only allow the
adversary to execute the encryption protocol of Σ*, via β = 1 she can decrypt
any ciphertext, because she can still run the decryption protocol of Σ.

Hence, inversion security is a property in its own right, not covered by the
above notion of “(q, ε)-security”.

5.3 Implementing the ARK Scheme 

For actually implementing the ARK scheme, one needs to instantiate the building
blocks with cryptographic functions. In this context, the role of the blocksizes a
and b is important. Note that the parameter b is simply ignored in the proof of
security. But we require the hash function H : {0,1}* → {0,1}^b to be collision
resistant. Thus, it must be infeasible to do close to √(2^b) offline calculations. On
the other hand, due to Theorem 1 the ARK scheme’s construction is sound if the
total query number q ≪ √(2^a). This restricts the number of online calculations
for the adversary.

The difference between offline calculations (using any hardware the adversary
has money for) and online calculations on smartcards allows the implementor to
choose b > a. The current author recommends b ≥ 160 and a ≥ 128.

The hash function H can be a dedicated hash function such as SHA-1 or
RIPEMD-160. The block cipher E needs to be a 128-bit block cipher, such as
the (soon to be chosen) DES-successor AES. In this case, the pseudorandom
functions F_K must map a 160-bit input to a 128-bit output. One can realize
these functions as message authentication codes (MACs), e.g. as CBC-MACs
based on E. Such MACs are provably secure Q.

Finally, for the encryption function S we have a couple of choices. S may
either be a dedicated stream cipher, such as RC4, or a block cipher in a standard
chaining mode, such as CBC or OFB. If one is using a block cipher, using the
same one as inside the smartcard is reasonable.

Sometimes, a given application allows the security architect to drastically
restrict the number of en- and decryptions during the lifetime of a key. If this
number is well below 2^32, we may even use a 64-bit block cipher for E, e.g. triple
DES. In this case, we need to observe two things:

1. The number of en- or decryptions should never exceed a previously defined 
bound q* <C 2^^, say q* = 2^^. The card should be provided with a counter, 
to ensure the card to stop working after q* protocol executions. 

2. The encryption function S is defined to take an a-bit value as the key. Knowing
one plaintext part Q_j and the corresponding ciphertext part D_j =
S_{Y_j}(Q_j), the a-bit key Y_j can be found by brute force, on the average in
2^{a−1} steps. For a = 64, one should modify the ARK scheme and choose
another key-dependent function to stretch the 64-bit value Y to get a larger 
key. E.g., think of using the block cipher E under a new key K^, and send 
the 128-bit value {Yj, EKsiYj)) to the card. This allows a key size for S of 
up to 128 bits. During the h-phase, the adversary learns qu pairs of known 
plaintext Yj and ciphertext EK^iXj). This is no problem, since we anyway 
assume E to behave pseudorandomly. 

Abstract. In a recent paper we developed a new cryptanalytic technique
based on impossible differentials, and used it to attack the Skipjack
encryption algorithm reduced from 32 to 31 rounds. In this paper we 
describe the application of this technique to the block ciphers IDEA 
and Khufu. In both cases the new attacks cover more rounds than the 
best currently known attacks. This demonstrates the power of the new 
cryptanalytic technique, shows that it is applicable to a larger class of 
cryptosystems, and develops new technical tools for applying it in new 
situations. 



1 Introduction 

In [bl I Y| a new cryptanalytic technique based on impossible differentials was 
proposed, and its application to Skipjack and DEAL was described. 
In this paper we apply this technique to the IDEA and Khufu cryptosystems. 
Our new attacks are much more efficient and cover more rounds than the best 
previously known attacks on these ciphers. 

The main idea behind these new attacks is a bit counter-intuitive. Unlike tra- 
ditional differential and linear cryptanalysis which predict and detect statistical 
events of highest possible probability, our new approach is to search for events 
that never happen. Such impossible events are then used to distinguish the ci- 
pher from a random permutation, or to perform key elimination (a candidate 
key is obviously wrong if it leads to an impossible event). 

The fact that impossible events can be useful in cryptanalysis is an old idea 
(for example, some of the attacks on Enigma were based on the observation that 
letters can not be encrypted to themselves). However, these attacks tended to be 
highly specific, and there was no systematic analysis in the literature of how to 
identify an impossible behavior in a block cipher and how to exploit it in order 
to derive the key. In this paper we continue to develop these attacks including 
the general technique called miss in the middle to construct impossible events 
and a general sieving attack which uses such events in order to cryptanalyze 
the block-cipher. We demonstrate these techniques in the particular cases of 
the IDEA and Khufu block ciphers. The main idea is to find two events with
probability one, whose conditions cannot be met together. In this case their
combination is the impossible event that we are looking for. Once the existence
of impossible events in a cipher is proved, it can be used directly as a distinguisher
from a random permutation. Furthermore, we can find the keys of a cipher by
analyzing the rounds surrounding the impossible event, and guessing the subkeys
of these rounds. All the keys that lead to impossibility are obviously wrong. The
impossible event in this case plays the role of a sieve, methodically rejecting the
wrong key guesses and leaving the correct key. We stress that the miss in the
middle technique is only one possible way to construct impossible events and
the sieving technique is only one possible way to exploit them.

Table 1. Summary of our attacks on IDEA with reduced number of rounds compared
to the best previous results

Year [Author]     Rounds   Type                      Chosen Plaintexts   Time of Analysis
1993 [23]         2        differential              2^10                2^42
1993 [23]         2.5      differential              2^10                2^106
1993 ing          2.5      differential              2^10                2^32
1997 0            3        differential-linear       2^29                2^44
1997 [3]          3.5      truncated-differential    2^56                2^67
1998 This paper   3.5*     impossible-differential   2^38.5              2^56
                  4**      impossible-differential   2^37                2^70
                  4.5***   impossible-differential   2^64                2^112

* From the second to the middle of the fifth round.
** From the second to the end of the fifth round.
*** From the middle of the first to the end of the fifth round.

In order to get a sense of the attack, consider a cipher E(·) with n-bit blocks,
a set of input differences 𝒫 of cardinality 2^p and a corresponding set of output
differences 𝒬 of cardinality 2^q. Suppose that no difference from 𝒫 can cause
an output difference from 𝒬. We ask how many chosen texts should be requested
in order to distinguish E(·) from a random permutation? In general about
2^{n−q} pairs with differences from 𝒫 are required. This number can be reduced by
using structures (a standard technique for saving chosen plaintexts in differential
attacks, see 0). In the optimal case we can use structures of 2^p texts which
contain about 2^{2p−1} pairs with differences from 𝒫. In this case 2^{n−q}/2^{2p−1}
structures are required, and the number of chosen texts used by this distinguishing
attack is about 2^{n−p−q+1} (assuming that 2p < n − q + 1). Thus, the higher is
p + q the better is the distinguisher based on the impossible event.
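
The counting in the previous paragraph can be reproduced with a small Python helper that works with base-2 logarithms throughout. The function name `distinguisher_cost` is a hypothetical name chosen for this sketch; the 2^{n−q} figure reflects that a random permutation maps a given pair to an output difference in the set of size 2^q with probability about 2^{q−n}.

```python
def distinguisher_cost(n, p, q):
    """Return log2 of (pairs needed, pairs per structure, structures, chosen texts)."""
    if not 2 * p < n - q + 1:
        raise ValueError("the estimate assumes 2p < n - q + 1")
    log_pairs = n - q                      # about 2^(n-q) pairs are required
    log_pairs_per_structure = 2 * p - 1    # a structure of 2^p texts holds 2^(2p-1) pairs
    log_structures = log_pairs - log_pairs_per_structure
    log_texts = log_structures + p         # equals n - p - q + 1
    return log_pairs, log_pairs_per_structure, log_structures, log_texts
```

For instance, with n = 64 and p = q = 16 the helper returns (48, 31, 17, 33), i.e. about 2^33 chosen texts arranged in 2^17 structures.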

This paper is organized as follows: In Section 2 we propose attacks on IDEA EH.
We develop the best known attack on IDEA reduced to 3.5 rounds and the
first attacks on 4 and 4.5 rounds, as described in Table 1. In Section 3 we show
that this technique can also be applied to Khufu m. Section 4 concludes the paper
with a discussion of provable security of ciphers against differential attacks,
and describes several impossible differentials of DES, FEAL, and CAST-256.

2 Cryptanalysis of IDEA

The International Data Encryption Algorithm (IDEA) is a 64-bit, 8.5-round non- 
Feistel block cipher with 128-bit keys, proposed by Lai and Massey in 1991 E0|. 
It is a modified version of a previous design by the same authors with added 
strength against differential attacks 0. 

Although almost a decade has passed since its introduction, IDEA resisted
intensive cryptanalytic efforts. Progress in cryptanalyzing
round-reduced variants was very slow, starting with an attack on a two round
variant of IDEA in 1993 [23] by Meier and leading to the currently best attack on
3.5 rounds published in 1997 [3] by Borst et al. In fSl page 79] IDEA reduced
to four rounds was claimed to be secure against differential attacks. Table 1
summarizes the history of attacks on IDEA and our new results described in
this paper (all attacks in this table are chosen plaintext attacks). In addition
to these attacks two relatively large easily detectable classes of weak keys were 
found: In El 2®^ weak keys out of the 2^^® keys were found to be detectable 
with 16 chosen plaintexts and 2^^ steps using differential membership tests, and 
in m 2®® weak keys were found to be detectable given 20 chosen plaintexts 
with a negligible complexity under differential-linear membership tests. Still the 
chance of choosing a weak key at random is about 2“®® which is extremely low. 
Related key attacks |Zj on 3.5 rounds m and on 4 rounds m of IDEA were 
developed but these are mainly of theoretical interest. Due to its strength against 
cryptanalytic attacks, and due to its inclusion in several popular cryptographic 
packages (such as PGP and SSH) IDEA became one of the best known and most 
widely used ciphers. 

Before we describe the attacks we introduce our notation. IDEA is an 8.5-round
cipher using two different half-round operations: key mixing (which we
denote by T) and M-mixing denoted by M = s ∘ MA, where MA denotes a
multiplication-addition structure and s denotes a swap of two middle words.¹
Both MA and s are involutions. T divides the 64-bit block into four 16-bit words
and mixes the key with the data using multiplication modulo 2^16 + 1 (denoted by
⊙) with 0 ≡ 2^16 on words one and four, and using addition modulo 2^16 (denoted
by ⊞) on words two and three. The full 8.5-round IDEA can be written as

$$\mathrm{IDEA} = T \circ s \circ (s \circ MA \circ T)^8 = T \circ s \circ (M \circ T)^8 .$$

We denote the input to the key mixing step T in round i by X^i, and its output
(the input to M) by Y^i. The rounds are numbered from one and the plaintext is
thus denoted by X^1. We later consider variants of IDEA with a reduced number
of rounds which start with M instead of T. In these variants the plaintext is
denoted by Y^1 (and the output of M is then X^2). See Figure 1 for a picture of
one round of IDEA.

¹ As usual the composition of transformations is applied from right to left, i.e., MA
is applied first, and the swap s is applied to the result.

[Figure: one round of IDEA]

In the rest of this section we describe a 2.5-round impossible differential of
IDEA (in terms of XOR differences), and chosen plaintext attacks on IDEA
reduced to 4 and 4.5 rounds using this impossible differential, which are faster 
than exhaustive search. We also describe a similar attack on 3.5-rounds of IDEA, 
which is more than 2^^ times faster than the best previously known attack jOj 
and which uses 2^^ times less chosen plaintexts. One interesting feature of these 
attacks is that they are independent of many of the design details of IDEA: They 
work for any choice of the MA permutation, and for any order of the ⊙ and ⊞
operations in the key-mixing T. In addition they depend only marginally on the 
choice of the key-scheduling of IDEA. 

2.1 A 2.5-Round Impossible Differential of IDEA 

Our main observation is that IDEA has a 2.5-round differential with probability 
zero. Consider the 2.5 rounds MoToMoToM. Then the input difference 
(a,0,a, 0) (where 0 and a yf 0 are 16-bit words) cannot cause the output diffe- 
rence (6, b, 0, 0) after 2.5 rounds for any 6 yf 0. To prove this claim, we make the 
following observations: 

1. Consider a pair with an input difference (a, 0, a, 0) for a ≠ 0. In such a pair,
the inputs to the first MA-structure have difference zero, and the outputs
of the first MA have difference zero. Thus, the difference after the first
half-round (s o MA) is (a, a, 0, 0) (after the swap of the two middle words). After
the next half-round (T) the difference becomes (c, d, 0, 0) for some c ≠ 0 and
d ≠ 0.

2. Similarly, consider a pair with an output difference (b, b, 0, 0) for b ≠ 0 after
2.5 rounds. In such a pair the difference before the last half-round (M) is
(b, 0, b, 0), and the difference before the last T is of the form (e, 0, f, 0) for
some e ≠ 0 and f ≠ 0.

3. Therefore, if the input and output differences are both as above, the input
difference of the middle half-round (M) is (c, d, 0, 0), and the output
difference of the same half-round is (e, 0, f, 0). The difference before the swap of
the two middle words is (e, f, 0, 0). From these differences we conclude that
the difference of the inputs to the MA-structure in the middle half-round is
non-zero, (c, d) = (e, f), while the output difference is (c ⊕ e, d ⊕ f) = (0, 0).
This is a contradiction, as the MA-structure is a permutation. Consequently,
there are no pairs satisfying both the input and the output differences
simultaneously.

Due to symmetry there is another impossible 2.5-round differential, with input
difference (0, a, 0, a) and output difference (0, 0, b, b).
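
The argument above can be checked exhaustively on a scaled-down model. The following is only a sketch under stated assumptions: it uses a toy IDEA-like cipher with 4-bit words (multiplication modulo 2^4 + 1 = 17 and addition modulo 2^4), and the function names (`half_round_t`, `half_round_m`, `count_hits`) and the random subkeys are illustrative choices, not part of the paper. Because the proof only uses the structure of T and M and the fact that MA is a permutation, the count of pairs following (a, 0, a, 0) to (b, b, 0, 0) over M o T o M o T o M should be zero for any key.

```python
import random

N = 4                      # toy word size in bits (IDEA itself uses 16)
MASK = (1 << N) - 1
MOD = (1 << N) + 1         # 17 is prime, just like 2^16 + 1

def mul(a, b):
    """Multiplication modulo 2^N + 1, where the word 0 stands for 2^N."""
    a = a or (1 << N)
    b = b or (1 << N)
    return (a * b % MOD) & MASK

def add(a, b):
    """Addition modulo 2^N."""
    return (a + b) & MASK

def half_round_t(x, k):
    """Key-mixing half-round T with four subkey words."""
    return (mul(x[0], k[0]), add(x[1], k[1]), add(x[2], k[2]), mul(x[3], k[3]))

def half_round_m(y, k):
    """Half-round M = s o MA: MA-structure followed by the middle-word swap."""
    p, q = y[0] ^ y[2], y[1] ^ y[3]
    t0 = mul(p, k[0])
    t1 = mul(add(q, t0), k[1])
    t2 = add(t0, t1)
    return (y[0] ^ t1, y[2] ^ t1, y[1] ^ t2, y[3] ^ t2)

def pack(w):
    return (w[0] << 3 * N) | (w[1] << 2 * N) | (w[2] << N) | w[3]

def unpack(v):
    return ((v >> 3 * N) & MASK, (v >> 2 * N) & MASK, (v >> N) & MASK, v & MASK)

def count_hits(seed=0):
    """Count pairs with input difference (a,0,a,0) and output difference (b,b,0,0)."""
    rng = random.Random(seed)
    km = [[rng.randrange(1 << N) for _ in range(2)] for _ in range(3)]
    kt = [[rng.randrange(1 << N) for _ in range(4)] for _ in range(2)]
    table = []
    for v in range(1 << (4 * N)):          # encrypt every 16-bit toy block
        s = half_round_m(unpack(v), km[0])
        s = half_round_t(s, kt[0])
        s = half_round_m(s, km[1])
        s = half_round_t(s, kt[1])
        s = half_round_m(s, km[2])
        table.append(pack(s))
    hits = 0
    for a in range(1, 1 << N):
        delta = pack((a, 0, a, 0))
        for v in range(1 << (4 * N)):
            d1, d2, d3, d4 = unpack(table[v] ^ table[v ^ delta])
            if d1 == d2 != 0 and d3 == 0 and d4 == 0:
                hits += 1
    return hits
```

Calling `count_hits()` with different seeds exercises different toy keys; the toy word size is chosen only so that all 2^16 blocks can be enumerated quickly.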



2.2 An Attack on 3.5-Round IDEA 

Consider the first 3.5 rounds of IDEA, T o (M o T)^3. We denote the plaintext by
X^1 and the ciphertext by Y^4. The attack is based on the 2.5-round impossible
differential with two additional T half-rounds at the beginning and end, and
consists of the following steps:

1. Choose a structure of 2^32 plaintexts X^1 with identical X_2^1, identical X_4^1, and
all possibilities of X_1^1 and X_3^1.

2. Collect about 2^31 pairs from the structure whose ciphertext differences satisfy
Y_3^4' = 0 and Y_4^4' = 0.

3. For each such pair

a) Try all the 2^32 possible subkeys of the first T half-round that affect X_1^1
and X_3^1, and partially encrypt X_1^1 and X_3^1 into Y_1^1 and Y_3^1 in each of
the two plaintexts of the pair. Collect about 2^16 possible 32-bit subkeys
satisfying Y_1^1' = Y_3^1'. This step can be done efficiently with 2^16 time and
memory complexity.

b) Try all the 2^32 possible subkeys of the last T half-round that affect X_1^4
and X_2^4, and partially decrypt Y_1^4 and Y_2^4 into X_1^4 and X_2^4 in each of
the two ciphertexts of the pair. Collect about 2^16 possible 32-bit subkeys
satisfying X_1^4' = X_2^4'. This step can be done efficiently with 2^16 time and
memory complexity.

c) Make a list of all the 2^32 64-bit subkeys combining the previous two
steps. These subkeys cannot be the real value of the key: if they were,
the pair would satisfy the differences of the impossible differential.

4. Repeat this analysis for each one of the 2^31 pairs obtained in each structure
and use a total of about 90 structures. Each pair defines a list of about 2^32
incorrect keys. Compute the union of the lists of impossible 64-bit subkeys
they suggest. It is expected that after about 90 structures, the number of
remaining wrong key values is 2^64 · (1 - 2^-32)^(2^31 · 90) ≈ 2^64 · e^-45 ≈ 0.5, and
thus the correct key can be identified as the only remaining value.

5. Complete the secret key by analyzing the second differential (0, a, 0, a).
Similar analysis will give 46 new key bits (16 bits out of 64 are in common
with the bits that we already found, and two bits 17 and 18 are common
between the 1st and 4th rounds of this differential). Finally guess the 18 bits
that are still not found to complete the 128-bit secret key.

This attack requires about 2^38.5 chosen plaintexts and about 2^53 steps of analysis.
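
The filtering estimate in step 4 and the data requirement can be reproduced numerically. This is a minimal sketch; the helper name `expected_wrong_keys` and its parameters are assumptions introduced for illustration. It uses `math.log1p` so that the tiny per-pair exclusion probability 2^-32 is not lost to rounding.

```python
import math

def expected_wrong_keys(key_bits, excluded_log2, pairs_log2, structures):
    """Expected number of wrong keys surviving the sieve.

    Each analyzed pair rules out 2^excluded_log2 of the 2^key_bits candidates,
    and each structure contributes about 2^pairs_log2 useful pairs.
    """
    n_pairs = structures * 2.0 ** pairs_log2
    p_excluded = 2.0 ** (excluded_log2 - key_bits)
    log_survive = n_pairs * math.log1p(-p_excluded)
    return math.exp(key_bits * math.log(2) + log_survive)

# 64-bit subkey, 2^32 keys excluded per pair, 2^31 pairs per structure, 90 structures
remaining = expected_wrong_keys(64, 32, 31, 90)

# 90 structures of 2^32 chosen plaintexts each
data_log2 = math.log2(90 * 2 ** 32)

print(f"remaining wrong keys ~ {remaining:.2f}, data ~ 2^{data_log2:.1f}")
```

The same helper can be used to see how the expected number of survivors changes with the number of structures.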

A naive approach here (which works for any key schedule) requires 2®^ steps 
and 2®"^ memory. A memory-efficient implementation requires only 2^® memory. 
In the particular case of rounds 2-4 of the key schedule of IDEA the subkeys of 
the 2nd and the 4th rounds have 11 key bits in common. Using this observation 
the attack requires only 2®® steps and 2®^ memory. 

2.3 An Attack on a 4-Round IDEA 

The attack is also applicable to IDEA reduced to 4 rounds: (M o T)^4, from the
second to the fifth round (inclusive). We denote the plaintext by X^2 and the
ciphertext by X^6. Depending on the starting round and on the differential being
used ((a, 0, a, 0) or (0, a, 0, a)), there is a varying amount of overlap between the
subkey bits. In the case of our choice (from second to the fifth round, with the
first differential), we will work with subkeys:

Z_1^2[97...112], Z_3^2[26...41], Z_1^5[76...91], Z_2^5[92...107], Z_5^5[12...27], Z_6^5[28...43];

these have 69 distinct key bits out of 6 · 16 = 96. The attack guesses the two
subkeys Z_5^5, Z_6^5 of the last MA structure, and for each guess performs the previous
attack on 3.5-round IDEA. More precisely,

1. For each guess of Z_5^5, Z_6^5:

a) Decrypt the last half round of all the structures, using the guessed
subkeys.

b) For each structure find all pairs with zero differences in the third and
fourth words, leaving about 2^31 pairs per structure.

c) For each pair:

i. Notice that at this point we already know Z_3^2 due to the subkey
overlap. Thus, we calculate the difference of the third words (where *
marks the second text of the pair):

(Z_3^2 ⊞ X_3^2) ⊕ (Z_3^2 ⊞ X_3^2*),

and find the key Z_1^2 which produces the same difference in the first
words:

(Z_1^2 ⊙ X_1^2) ⊕ (Z_1^2 ⊙ X_1^2*).

On average only one Z_1^2 is suggested per pair.

ii. Similarly find the pairs of keys Z_1^5 and Z_2^5 which cause equal
differences at the 5th round. Since Z_1^2 and Z_2^5 share eleven key bits, we
are left with about 2^5 choices of subkey pairs, and thus with about 2^5
choices of newly found 37 subkey bits. These choices are impossible.

d) We need about 50 structures to filter out all the wrong keys (this is
because we fix many key bits at the outer-most loop):




2. After analyzing all the structures only a few possible subkey values remain. 
These values are verified using auxiliary techniques. 

This attack requires about 50 • 2^^ « 2^® chosen plaintexts packed into structures 
as in the previous section. The total complexity of this attack consists of about 
232 . 238 half-round decryption (MA) steps which are equivalent to about 2®^ 4- 
round encryptions plus about 2®^ • 2®^ • 2® « 2^^ simple steps. When these steps 
are performed efficiently, they are equivalent to about 2^® 4-round encryption 
steps, and thus the total time complexity is about 2^° encryptions. 



2.4 An Attack on a 4.5-Round IDEA 

In this section we describe our strongest attack which can be applied to the 4.5
rounds of IDEA described by M o (T o M)^4, which start after the first T
half-round. We denote the plaintext by Y^1 and the ciphertext by X^6. In addition
to the 64 key bits considered in the previous section we now need to find the
subkeys of the two additional M half-rounds. We observe however, that only 16
of these key bits are new, and the other 48 bits are either shared with the set
we found in the previous section, or are shared between the first and the last
half-rounds. Therefore, it suffices to guess 80 key bits in order to verify whether
the impossible differential occurs. These key bits are 12-43, 65-112, covering the
subkeys:

Z_5^1[65...80], Z_6^1[81...96], Z_1^2[97...112], Z_3^2[26...41],

Z_1^5[76...91], Z_2^5[92...107], Z_5^5[12...27], Z_6^5[28...43].
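
The bit ranges listed above follow from the IDEA key schedule, in which eight 16-bit subkeys are read from the 128-bit key and the key is then rotated left by 25 bits. The sketch below recomputes them; the function name `subkey_bits` and the 1-based bit numbering (bit 1 is the most significant key bit) are assumptions chosen to match the ranges quoted in the text.

```python
def subkey_bits(rnd, idx):
    """Key-bit positions (1-based) of subkey Z_idx of round rnd in IDEA."""
    j = 6 * (rnd - 1) + (idx - 1)                 # running subkey number, from 0
    start = (25 * (j // 8) + 16 * (j % 8)) % 128  # 25-bit rotation per 8 subkeys
    return {(start + t) % 128 + 1 for t in range(16)}

def union(subkeys):
    bits = set()
    for rnd, idx in subkeys:
        bits |= subkey_bits(rnd, idx)
    return bits

# Subkeys used by the 4-round attack, as (round, index) pairs
four_round = [(2, 1), (2, 3), (5, 1), (5, 2), (5, 5), (5, 6)]
# Additional subkeys of the first M half-round used by the 4.5-round attack
extra = [(1, 5), (1, 6)]

bits4 = union(four_round)
bits45 = union(four_round + extra)
shared = subkey_bits(2, 1) & subkey_bits(5, 2)

print(len(bits4), len(bits45), sorted(shared))
```

Under these assumptions the 4-round subkeys cover 69 distinct key bits, the 4.5-round subkeys cover the 80 bits 12-43 and 65-112, and Z_1^2 and Z_2^5 overlap in eleven bits.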

The attack consists of the following steps: 

1. Get the ciphertexts of all the 2^64 possible plaintexts.

2. Define a structure to be the set of all 2^32 encryptions in which X_2^2 and
X_4^2 are fixed to some arbitrary values, and X_1^2 and X_3^2 range over all the
possible values. Unlike the previous attacks, these structures are based on
the intermediate values rather than on the plaintexts.

3. Try all the 2^80 possible values of the 80 bits of the subkeys. For each such
subkey

a) Prepare a structure, and use the trial key to partially decrypt it by one
half-round with the keys Z_5^1 and Z_6^1 to get the 2^32 plaintexts.

b) For each plaintext find the corresponding ciphertext and partially
decrypt the last two half-rounds by the trial subkeys (Z_5^5, Z_6^5 and Z_1^5, Z_2^5).
Partially encrypt all pairs in the structure with the subkeys Z_1^2 and Z_3^2.

c) Check whether there is some pair in the structure which satisfies the
64-bit condition Y_1^2' = Y_3^2', X_1^5' = X_2^5', Y_3^5' = 0, and Y_4^5' = 0.

d) If there is such an impossible pair, the trial 80-bit value of the subkeys
cannot be the right value.

e) If there is no such pair in the structure, try again with another structure.

f) If no pairs are found after trying 100 structures, the trial 80-bit value is
the real value of the 80 bits of the key.

4. Assuming that a unique 80-bit value survives the previous steps, the
remaining 48 bits of the key can be found by exhaustive search.

This attack requires 2^64 plaintexts, and finds the key within 2^112 steps using
about 2^32 memory. This is about 2^16 times faster than exhaustive search. See
Table 1 for a summary of our attacks on IDEA compared to the best previous
attacks.

3 Attacks on Khufu 

Khufu and Khafre are two 64-bit block 512-bit key ciphers designed by
Merkle [24] with a fast software implementation in mind. Khufu is faster than Khafre
due to a smaller number of rounds but has a much slower key-setup. The strength
of Khufu is based on key-dependent 8x32-bit S-boxes. These are unknown to an
attacker and thus defy analysis based on specific properties of the S-boxes. The
only additional way in which the key is used is at the beginning and at the
end of the cipher, where 64-bit subkeys are XORed to the plaintext and to the
ciphertext. The cipher is a Feistel cipher, so the input to a round is split into
two 32-bit halves L and R. Each round consists of the following simple steps:

1. Use the least significant byte of L as an input to the S-box: S[LSB(L)].

2. XOR the output of the S-box with R: R = R ⊕ S[LSB(L)].

3. Rotate L by several bytes according to the rotation schedule.

4. Swap L and R.

The S-box is changed every eight rounds in order to avoid attacks based on 
guessing a single S-box entry. The rotation schedule of Khufu for every eight 
rounds is: 2, 2, 1, 1, 2, 2, 3, 3 (byte rotations to the right). Since our attack works 
equally well for any rotation schedule which uses all four bytes of each word 
every eight consecutive rounds, we simplify the description of the attack by
assuming that all the rotations are by a single byte to the left. A description of
this simplified version of Khufu can be found in Figure 2. Khafre differs from
Khufu only in two aspects: its S-boxes are known, and it XORs additional 64-bit 
subkeys to the data every eight rounds. The best currently known attack on 
Khafre is by Biham and Shamir |^, which requires about 1500 chosen plaintexts 
for attacking 16 rounds, and about 2^53 chosen plaintexts for attacking 24
rounds. The best attack on Khufu is by Gilbert and Chauvaud m-. It attacks the
16-round Khufu, and requires about 2^43 chosen plaintexts and 2^43 operations
(preliminary information on the secret key can be derived with about 2^^
chosen plaintexts in 2^^ steps). It is believed that Khufu is stronger than Khafre,
since Khufu has secret key-dependent S-boxes, which prohibit attacks based on
analysis of specific S-boxes.

Fig. 2. Description of Khufu and Khafre

Interestingly the approach described in this section is not very sensitive to 
the differences between these two ciphers, and works well for both of them since 
it is independent of the concrete choice of the S-boxes and (surprisingly) does 
not assume their knowledge by an attacker. 



3.1 Impossible Differentials of Khufu and Khafre 

In this section we describe long impossible differentials for Khufu and Khafre.
The impossibilities stem mainly from the fact that the avalanche effect of the
difference can be postponed by eight rounds. This leads to many eight round
differentials with probability one, whose concatenation is contradictory. Due to
the byte-oriented structure, these differentials come in sets of 256 or larger,
and allow tight packing into structures. We study mainly the differentials with
an eight byte input difference 000000+0, where ‘0’ denotes a byte with zero
difference, and ‘+’ denotes a byte with arbitrary non-zero difference; ‘*’ is later
used to denote a byte with any (zero or non-zero) difference. However, two byte
and three byte input differences are possible as long as p + q remains constant
(see the relevant discussion in the Introduction). Notice that a XOR of two
different S-box entries necessarily looks like ++++, since the S-boxes are built from
four permutations. Let us study one of these differentials in some more detail.
To simplify presentation, we assume that Khufu and Khafre are implemented
without swaps, and that the S boxes are used alternatingly in the left half and
the right half.

Table 2. Impossible Differentials of Khufu and Khafre

Rounds   Input          Output
14       000000+0   ↛   *00**00*
15       000000+0   ↛   000**00*
16       000000+0   ↛   000*000*
17       000000+0   ↛   0000000*

The differential we describe below spans 16 rounds of Khufu and Khafre. It
covers a set of 256 input differences for which a set of 2^16 output differences is
impossible.

1. Consider a pair of inputs with difference 000000+0. After eight rounds this 
difference is always of the form ++++00+0. 

2. Similarly consider a pair with the output difference 000*000+ after the 16th 
round. This output difference can only be derived from a difference 00*000*0 
at the output of the 10th round, as the differing S bytes do not affect any S 
box between these rounds. 

3. Therefore, the output difference of the S box in round 9 has the form
00+0 ⊕ 000* = 00+*.

4. However, the input difference of the S box in round 9 must be non-zero, and
due to the design of the S boxes, the output differences must have the form
++++, which contradicts the form 00+*.

This impossible differential is described in Figure 3. The above representation
ensures that we write intermediate differences in the same order as in the figure.
A 17-round impossible differential 000000+0 ↛ 0000000* is reached by adding
one round to this 16-round impossible differential, while canceling the difference
in the left half of the ciphertexts. The impossible differentials of this kind are
summarized in Table 2.
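
The two facts this construction relies on (an S-box built from four byte permutations always gives an output difference of the form ++++, and a difference can stay dormant for eight rounds with probability one) can be illustrated on the simplified cipher. The sketch below is an assumption-laden model, not Merkle's specification: it uses a random S-box instead of the key-derived one, rotates by one byte to the left in every round, keeps the swap, and omits the subkey XORs. The byte position found by the search depends on these conventions and need not coincide with the paper's left-to-right byte numbering.

```python
import random

def make_sbox(rng):
    """Random 8x32-bit S-box whose four output bytes are each a permutation."""
    cols = []
    for _ in range(4):
        perm = list(range(256))
        rng.shuffle(perm)
        cols.append(perm)
    return [cols[0][i] | cols[1][i] << 8 | cols[2][i] << 16 | cols[3][i] << 24
            for i in range(256)]

def rotl32(x, r):
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF

def encrypt(left, right, sbox, rounds=8):
    """Simplified Khufu octet: S-box lookup, XOR, one-byte rotation, swap."""
    for _ in range(rounds):
        right ^= sbox[left & 0xFF]
        left = rotl32(left, 8)
        left, right = right, left
    return left, right

def nonzero_bytes(word):
    return sum(1 for s in (0, 8, 16, 24) if (word >> s) & 0xFF)

def positions_with_probability_one(trials=64, seed=0):
    """Byte positions whose difference always yields a '++++' half plus one '+' byte."""
    rng = random.Random(seed)
    sbox = make_sbox(rng)
    good = []
    for pos in range(8):            # 0-3: bytes of the left half, 4-7: right half
        ok = True
        for _ in range(trials):
            l, r = rng.getrandbits(32), rng.getrandbits(32)
            d = rng.randrange(1, 256) << (8 * (pos % 4))
            dl, dr = (d, 0) if pos < 4 else (0, d)
            l1, r1 = encrypt(l, r, sbox)
            l2, r2 = encrypt(l ^ dl, r ^ dr, sbox)
            weights = sorted((nonzero_bytes(l1 ^ l2), nonzero_bytes(r1 ^ r2)))
            if weights != [1, 4]:
                ok = False
                break
        if ok:
            good.append(pos)
    return good
```

In this model only the byte that enters the S-box in the eighth round has the probability-one property, which is the analogue of the input difference 000000+0 leading to ++++00+0.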

3.2 The New Attacks 

The best known attack against Khufu can attack up to 16 rounds and the best 
known attack against Khafre can attack up to 24 rounds. Using the impossible 
differential described above, we can attack Khufu and Khafre with up to 18 
rounds. Consequently, the new 18-round attack is only interesting in the case of 
Khufu. For the sake of simplicity, we describe only a less-complicated attack on 
Khufu with 16 rounds which requires 2^® complexity. 
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Fig. 3. The 16-Round Impossible Differential of Khufu and Khafre (simplified by equal 
rotations in all rounds). In this figure white squares represent zero differences, gray
squares represent the zero differences which are also input bytes to the S boxes, and
black squares represent bytes of type + or *.

This attack uses the 15-round impossible differential 000000+0 ↛ 000**00*.
Since the S-boxes are unknown, we can always assume that the bytes of the
last subkey can be arbitrarily set to zero, yielding an equivalent (but modified)
description of the corresponding S-boxes (and using a modified first subkey).

1. Encrypt structures of 256 plaintexts differing only in the 7th byte (we count
the bytes of the block from left to right).

2. Check all the 2^15 pairs contained in the structure and retain only those
ciphertext differences of the form +++*00+* (i.e., discard all the non-zero
differences in the fifth and sixth bytes and all the zero differences in the
second and third bytes of the ciphertexts). On average about half a pair
remains for each structure.

3. Denote the inputs to the S-box used in the last round in a particular pair by
i and j. Denote the ciphertext difference by C = C_1, C_2, ..., C_8. For each
remaining pair the following constraint on the three first bytes of S[i] ⊕ S[j]
cannot be satisfied:

(S[i] ⊕ S[j])_{1,2,3} = C_{1,2,3}

About two structures (2^9 chosen plaintexts) suffice to find the first such
constraint. About 2^^ constraints are required in order to actually derive the 
full description of three of the four output bytes of an S-box. Thus, this attack 
requires about 2^® chosen plaintexts. The rest of the S box information can be 
derived by auxiliary techniques. 

It is interesting to note that these attacks are particularly sensitive to re- 
dundancy in the plaintexts. If the distribution of the plaintexts is not uniform, 
then in some cases we can efficiently convert these chosen message attacks into 
known-plaintext and even ciphertext-only attacks, as described in 0. 

4 Concluding Remarks 

Since the introduction of differential cryptanalysis in 1990 various approaches to 
the design of ciphers with provable security against this attack were suggested 
(see for example P2H22I). One way of proving a cipher to be secure against 
differential attack is to show an upper bound on the probability of the best 
differential. For example in m for a Feistel cipher with a bijective F function
the probability of a three-round (or longer) differential was proved to be smaller
than 2p^2, where p is the highest probability for a non-trivial one-round
differential. This result makes it possible to construct Feistel ciphers with few rounds
which are provably resistant against conventional differential cryptanalysis (for 
example, four rounds with best differential probability < 2®^). Examples of such 
ciphers are ICAf |27pl and MISTY ■ 

^ A better bound of p^2 was proved later by Aoki and Ohta.

^ Recently broken by high-order differential techniques EMSI-

Notice however that any four and five round Feistel cipher has lots of
impossible differentials, which are independent of the exact properties of the round
function. For example, if the round function is bijective then for any value of
a ≠ 0, we have an impossible five-round differential (a, 0) ↛ (a, 0), since it causes
a zero output difference at the third round, but the round function is bijective
and the input difference of this round is non-zero (this was already observed 
in ini in the case of DEAL). 

Using the properties of the round function one can usually extend the
impossible differentials to cover even more rounds of a cipher. In the case of
DES we can devise 7-round impossible differentials which hold for any choice
of the S boxes, i.e., they still hold even if the S boxes are replaced by
arbitrary (possibly unknown or key dependent) choices, and even if their order
becomes key dependent (for example as in ^), or the S boxes change from
round to round. Let Θ be the (XOR) linear subspace spanned by the elements
of {00400000_x, 00200000_x, 00000002_x}, and let μ ∈ Θ and η ∈ Θ ⊕ ψ where
ψ = 00000004_x. Then, the differentials (μ, 0) ↛ (η, 0) and (η, 0) ↛ (μ, 0) are
impossible for any such choice of μ and η. Consider the plaintext difference (μ, 0)
and the ciphertext difference (η, 0). The input and output differences of the F
function in the first round are zero. The input difference of the F function in
the second round is μ, and thus only one S box is active in this round. The
output difference of this S box may activate up to six S boxes in the next round,
not including S3 and S8. As the active bit in ψ enters S8, this input bit of the
fourth round is affected neither by μ nor by the output difference of the third
round. Similarly, this bit is affected by the ciphertext difference, as it is active
in η, and it cannot be canceled by the output difference of the fifth round, due
to the same reasons that it cannot be affected by the output difference of the
third round. Therefore, this bit is both 0 and 1 in the input of the fourth round,
which is a contradiction.

FEAL has three 3-round characteristics with probability one. Using
two such characteristics, with additional three rounds in between results in the
following impossible differential (where a subscript x denotes a hexadecimal
number):

(02000000_x, 80800000_x) ↛ (02000000_x, 80800000_x).

In this case the characteristics with probability one ensure that the data after
round three and before round seven have the same difference:
(02000000_x, 80800000_x). Therefore, the output difference of the F-function in
round five is zero, and thus the input difference of F in this round is zero as
well (since F in FEAL is bijective). The input difference of F in round four is
02000000_x and the output difference must be 80800000_x, which is impossible in
the F function of FEAL (for example bit 19 of the output always differs for the
specified input difference).

CAST-256 has a 20-round impossible differential (17 forward rounds and 3
backward rounds, or vice versa) with inputs and outputs which differ only by
one word.

Another general belief is that large expanding S-boxes (n bits of input, m
bits of output, n ≪ m) offer increased security against differential attacks. In
particular 8x32 bit S-boxes are very popular, and can be found in Khufu, Khafre,
CAST, Blowfish, Twofish and other ciphers. However, the difference distribution
tables of such S-boxes contain very few possible entries (at most 2^15), and all the
other pairs of input/output differences are impossible. This facilitates
the construction of impossible differentials and can thus make such schemes more
vulnerable to the new type of attacks described in this paper.
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Abstract. We introduce “mod n cryptanalysis,” a form of partitioning 
attack that is effective against ciphers which rely on modular addition 
and bit rotations for their security. We demonstrate this attack with a 
mod 3 attack against RC5P, an RC5 variant that uses addition instead 
of XOR. We also show mod 5 and mod 257 attacks against some versions 
of a family of ciphers used in the FireWire standard. We expect mod 
n cryptanalysis to be applicable to many other ciphers, and that the 
general attack is extensible to other values of n. 



1 Introduction 

Nearly all modern statistical attacks on product ciphers work by learning some 
way to distinguish the output of all but the last rounds from a random permu- 
tation. In a linear attack, there is a slight correlation between the plaintext and 
the last-round input; in a differential attack, the relationship between a pair of 
inputs to the last round isn’t quite random. Partitioning attacks, higher-order 
differential attacks, differential-linear attacks, and related-key attacks all fit into 
this pattern. 

Mod n cryptanalysis is another attack along these lines. We show that, in 
some cases, the value of the last-round input modulo n is correlated to the 
value of the plaintext modulo n. In this case, the attacker can use this correla- 
tion to collect information about the last-round subkey. Ciphers that sufficiently 
attenuate statistics based on other statistical effects (linear approximations, dif- 
ferential characteristics, etc.) are not necessarily safe from correlations modulo 
n. 



1.1 The Rest of This Paper 

The rest of this paper is organized as follows. First, in Section 2 we introduce
mod 3 cryptanalysis and develop the tools we need to attack RC5P. Next, in
Section 3 we develop the attack on RC5P and show how it can be applied
in a reasonably efficient way to break RC5P variants with quite a few rounds.
A later section analyzes M6, a family of ciphers proposed for digital content protection.
Finally, we discuss what we’ve discovered so far, consider some
generalizations to our techniques, and point out a number of interesting open
questions whose answers we hope will be the subject of future research.

Also, in an appendix we demonstrate why our definition of bias is the right
one and recall some important facts about the χ² test.

2 Tools for Mod 3 Cryptanalysis 

In mod 3 cryptanalysis, we trace knowledge of the mod 3 value of some part of 
a cipher’s block through successive rounds of the cipher, leaving ourselves with 
some information about the input to the last round or two that lets us distin- 
guish it from a randomly-selected block. The attack is conceptually very similar 
to Matsui’s linear cryptanalysis |Mat94IBih95IKB.94IKB.95IKBM| , though it is 
properly included in the class of partitioning attacks |HKM95IHM97j develo- 
ped by Harpes and Massey. We also draw somewhat from Vaudenay’s statistical 
cryptanalysis 

In this paper, we will use the shorthand term “mod 3 value” to stand for the 
value we get when we take some selected 32-bit part of a block, and reduce it 
modulo 3. A mod 3 value may thus be only 0, 1, or 2. In a randomly-selected 
32-bit block, we would expect 0, 1, and 2 to occur as mod 3 values with almost 
identical likelihood. (If we automatically discarded any block with the value
2^32 - 1, we would have perfectly equal probabilities.) As a block cipher’s successive
rounds operate on its block, the block should become harder and harder to 
distinguish from a randomly-selected block, without knowledge of the cipher’s 
round keys. Mod 3 cryptanalysis works when the block’s mod 3 value is still not 
too hard to distinguish from that of a random block, very late into the cipher. 
(In the same sense, linear cryptanalysis works when the parity of some subset of 
the block’s bits is still not too hard to distinguish from that of a random block, 
very late into the cipher.) 

2.1 Approximating Rotations 

The insight that first led us to consider mod 3 cryptanalysis at all involved the
behavior of the mod 3 value of some 32-bit word, X, before and after being
rotated by one bit. When we consider X as a 32-bit integer, and X <<< 1 as
X rotated left by one bit, we can rewrite the effects of the rotation in terms of
integer arithmetic:

    X <<< 1 = 2X,              if X < 2^31
              2X + 1 - 2^32,   if X ≥ 2^31

The first thing to notice is that 2^32 ≡ 1 mod 3. Thus, X <<< 1 ≡ 2X mod 3,
because 2X + 1 - 2^32 ≡ 2X mod 3.

From this, we can derive the effect of any larger number of rotations. For
instance,

    X <<< 2 = (X <<< 1) <<< 1 ≡ 2 × 2 × X ≡ X mod 3

In general, we have

    X <<< n ≡ 2^n X ≡ 2^(n mod 2) X mod 3

so rotating by any odd number of bits multiplies the mod 3 value by 2, while
rotating by any even number of bits leaves the mod 3 value unchanged. This
means that when we know the number of bits a 32-bit block was rotated, and
what its input mod 3 value was, we also know what its output mod 3 value was.

Next let us consider the case where we know the input mod 3 value, but not
the rotation amount. We do not lose all knowledge of the output mod 3 value.
Indeed, some traces of X leak, because we know

    (X <<< n) mod 3 = 2X mod 3,   if n odd
                      X mod 3,    if n even

Note that in the case of X mod 3 = 0, we have X <<< n ≡ 0 mod 3, regardless
of n. Thus Pr[X <<< n ≡ X mod 3] = 4/6 when X is uniformly distributed, and
we have some incomplete knowledge on X <<< n.
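
The rotation rule and the 4/6 figure are easy to confirm empirically. The following sketch assumes that both the word X and the rotation amount n are drawn uniformly at random; the function names are illustrative.

```python
import random

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def check_rotation_rule(trials=10000, seed=0):
    """Verify (X <<< n) mod 3 == 2X mod 3 for odd n and X mod 3 for even n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        n = rng.randrange(32)
        expected = (2 * x if n % 2 else x) % 3
        if rotl32(x, n) % 3 != expected:
            return False
    return True

def prob_unchanged(trials=60000, seed=1):
    """Estimate Pr[(X <<< n) mod 3 == X mod 3] for uniform X and n."""
    rng = random.Random(seed)
    same = 0
    for _ in range(trials):
        x = rng.getrandbits(32)
        n = rng.randrange(32)
        same += rotl32(x, n) % 3 == x % 3
    return same / trials
```

The estimate returned by `prob_unchanged()` should be close to 2/3: one third of the words have mod 3 value 0 and are never changed, and the remaining words are unchanged exactly when n is even.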

We can express the propagation of partial information using the notation of
probability vectors. Let the probability vector p_X represent the distribution of X,
so that the j-th component of p_X is Pr[X = j]. Then, for example, if Y = X <<< 1
and p_X = [0, 1/2, 1/2], we find that p_Y = [0, 1/2, 1/2]. As another example, a
uniformly-distributed random variable U is represented by the probability vector
p_U = [1/3, 1/3, 1/3].

It is also tempting to think of operations such as <<< in terms of their
transition matrix M (where M_ij = Pr[f(X) = j | X = i]). However, as will be discussed
below, there are subtle pitfalls with such an approach.

In some cases it is also useful to view rotations as a multiplication modulo
2^32 - 1. The key observation is that we have the relation

    x <<< j ≡ 2^j x mod (2^32 - 1)

for rotations left by j bits. Reducing both sides modulo 3, we obtain x <<< j ≡
2^j x mod 3. (This is valid because 3 divides 2^32 - 1.) This is another way to derive
the mod 3 approximation of rotations given above.

We can also see that we get a good mod p approximation x <<< j ≡ 2^j x mod p
for bit-rotations whenever p divides 2^32 - 1. A later section explores this direction in
more detail.
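
As a quick illustration of the divisibility condition, the sketch below checks the relation for each prime factor of 2^32 - 1 = 3 · 5 · 17 · 257 · 65537 and shows that it fails for a modulus such as 7 that does not divide 2^32 - 1. The helper names and the choice of 7 as a counterexample are assumptions made for the example.

```python
import random

def rotl32(x, j):
    """Rotate a 32-bit word left by j bits."""
    j %= 32
    return ((x << j) | (x >> (32 - j))) & 0xFFFFFFFF

def holds_for(p, samples):
    """True if (x <<< j) == 2^j * x (mod p) for every sample (x, j)."""
    return all(rotl32(x, j) % p == (pow(2, j, p) * x) % p for x, j in samples)

rng = random.Random(0)
samples = [(rng.getrandbits(32), rng.randrange(32)) for _ in range(5000)]
samples.append((0x80000000, 1))   # rotating the top bit around gives the word 1

factors = [3, 5, 17, 257, 65537]
product = 1
for p in factors:
    product *= p

results = {p: holds_for(p, samples) for p in factors + [7]}
print(product == 2**32 - 1, results)
```

The moduli 5 and 257 are the ones the abstract mentions in connection with the FireWire cipher family.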



2.2 Approximating Addition Modulo 2^32

A similar analysis works for addition mod 2^32. Consider a simple description of
mod 2^32 addition in terms of integer addition:

    X + Y mod 2^32 = X + Y,          if there was no carry out
                     X + Y - 2^32,   if there was a carry out

Since 2^32 ≡ 1 modulo 3, this can be rewritten as

    (X + Y mod 2^32) mod 3 = X + Y mod 3,       if there was no carry out
                             X + Y - 1 mod 3,   if there was a carry out

Sometimes, we know the distribution of the carry. For example, we might
know that the high-order four bits of Y are all ones, and so know that the
carry-out probability is around 0.98. We can then rewrite this approximation as:

    (X + Y mod 2^32) mod 3 = X + Y mod 3,       with prob. 0.02
                             X + Y - 1 mod 3,   with prob. 0.98
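
The carry rule can be checked directly. This sketch (function name assumed) returns both the true mod 3 value of the 32-bit sum and the value predicted from the mod 3 values of the operands and the carry bit, so that the two can be compared.

```python
import random

def add32_mod3(x, y):
    """Return (actual, predicted) mod 3 values of (x + y) mod 2^32."""
    total = x + y
    carry = total >> 32                      # 1 if there was a carry out, else 0
    actual = (total & 0xFFFFFFFF) % 3
    predicted = (x % 3 + y % 3 - carry) % 3  # uses 2^32 = 1 (mod 3)
    return actual, predicted

rng = random.Random(0)
pairs = [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(10000)]
all_match = all(a == p for a, p in (add32_mod3(x, y) for x, y in pairs))
print(all_match)
```

Only the carry bit is needed to propagate the mod 3 value through a modular addition, which is why knowledge of the carry distribution translates directly into a probabilistic approximation.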



2.3 Biases and the ℓ2 Norm

As we discussed above, the probability vector p_U = (1/3, 1/3, 1/3) is
approximately what we would expect from a random 32-bit block. It would be nice to
have some measure of distance from the uniform distribution. In this paper, we
use the ℓ2 norm as our measure of bias. The bias of a probability vector p_X is
defined using this distance measure as

    ||p_X - p_U||^2 = Σ_j (p_X[j] - p_U[j])^2.

Intuitively, the larger the bias, the fewer samples of a block described by p_X
are necessary to distinguish those blocks from a random sequence of blocks. An
appendix motivates and formalizes this measure of bias: we find that
O(1/||p_X - p_U||^2) samples suffice to distinguish the distribution p_X from uniform and that
the χ² test may be used to implement the distinguisher.
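
As a small worked example of this definition, the sketch below computes the bias of the vector [0, 1/2, 1/2] used earlier. The function name `bias` is an assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def bias(p):
    """Squared l2 distance between a probability vector and the uniform one."""
    u = Fraction(1, len(p))
    return sum((Fraction(x) - u) ** 2 for x in p)

p_x = [Fraction(0), Fraction(1, 2), Fraction(1, 2)]
p_u = [Fraction(1, 3)] * 3

print(bias(p_x), bias(p_u))
```

For p_X = [0, 1/2, 1/2] the three deviations are -1/3, 1/6 and 1/6, so the bias is 1/9 + 1/36 + 1/36 = 1/6, and by the estimate above a number of samples on the order of 1/(1/6) = 6 is enough to tell it apart from uniform (up to the constant hidden in the O-notation).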



3 Mod 3 Cryptanalysis of RC5P 

RC5 is a conceptually simple block cipher designed by Ron Rivest |l{iv and 
analyzed in lkV9,^lkM9flMel98llik98lkVlia . The cipher gets its strength from 
data-dependent rotations, a construct also used in Madryga |Mad84| . Akelarre 
[lAClIVI FUTlj . RC6 IRRb-l-98l( and Mars IBCU-l-951 . Presently, 16 rounds 

(each RC5 round consists of two Feistel rounds) of RC5 is considered to be 
secure. RC5P is an RC5 variant described in and conjectured to be as 

secure as RC5. It is identical to RC5, except that the XORs in RC5 are replaced
with additions modulo 2^32 in RC5P.

In this section, we discuss a mod 3 attack on RC5P. We have implemented 
simplified versions of the attack on RC5P with up to seven full rounds (fourteen 
half rounds), but without input or output whitening. (This attack took about 
three hours on a 133 MHz Pentium.) We conjecture that this attack might be ex- 
tended to as many as nineteen or twenty rounds for at least the most susceptible 
keys. 

The RC5P round function is as follows:

    L := L + R
    L := L <<< R
    L := L + sk_{2i}
    R := R + L
    R := R <<< L
    R := R + sk_{2i+1}

^ Also called the Euclidian Squared Distance in mm-
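
The round function can be written out as runnable code. This is a sketch under assumptions: all additions are taken modulo 2^32, the rotation amount is the low five bits of the other half (as in RC5), and the names `rc5p_round` and `rc5p_round_inverse` are introduced here for illustration only.

```python
MASK = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate left by the low five bits of n."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK

def rotr32(x, n):
    """Rotate right by the low five bits of n."""
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK

def rc5p_round(l, r, sk_even, sk_odd):
    """One full RC5P round (two half rounds) with subkeys sk_2i and sk_2i+1."""
    l = (l + r) & MASK
    l = rotl32(l, r)
    l = (l + sk_even) & MASK
    r = (r + l) & MASK
    r = rotl32(r, l)
    r = (r + sk_odd) & MASK
    return l, r

def rc5p_round_inverse(l, r, sk_even, sk_odd):
    """Undo rc5p_round by reversing each step in the opposite order."""
    r = (r - sk_odd) & MASK
    r = rotr32(r, l)
    r = (r - l) & MASK
    l = (l - sk_even) & MASK
    l = rotr32(l, r)
    l = (l - r) & MASK
    return l, r
```

The inverse is included only as a sanity check that the round is a permutation of the 64-bit block for every pair of subkeys.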

3.1 Modeling the RC5P Round Function 

We initially tried to predict the bias of multiple RC5P rounds using several ma- 
thematical tools (transition matrices, matrix norms, second-largest eigenvalues, 
etc.). However, we found that precise analytical methods were surprisingly diffi- 
cult to develop for RC5P, because of a lack of independence between rounds: in 
technical terms, RC5P is not a Markov cipher with respect to mod 3 approxima- 
tions. As a result, multiplying the biases or transition matrices of each individual 
round gives incorrect answers. 

For these reasons, we abandoned our pursuit of precise mathematical models
and turned to empirical measurements. Let $(P_L, P_R)$ and $(C_L, C_R)$ represent
the plaintext and ciphertext after encrypting by $R$ rounds. For convenience, we
choose plaintexts so that $(P_L \bmod 3, P_R \bmod 3)$ is fixed: in practice, each
of the nine possibilities gives about the same test results. Then we empirically
compute the probability distribution of $(C_L \bmod 3, C_R \bmod 3)$ and measure
its bias using the $\chi^2$ test. More precisely, we count the number of texts
needed for the $\chi^2$ score to exceed a certain threshold. Since
$(C_L \bmod 3, C_R \bmod 3)$ has 9 possible outcomes, we use a chi-square test
with 8 degrees of freedom. To give some baseline figures, a threshold of
$\chi^2_{2^{-16},8}$ has a probability of about $2^{-16}$ of occurring in a random
sequence of inputs, while a test value of $\chi^2_{2^{-32},8}$ has a probability of
about $2^{-32}$ of occurring in a random sequence of inputs.
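To make the scoring step concrete, the following sketch computes the chi-square score of a sample of blocks against the uniform distribution on the nine mod 3 partitions, together with the tail probability of a score under 8 degrees of freedom. The function names are hypothetical, and the tail probability uses the standard closed form for a chi-square distribution with an even number of degrees of freedom; it is a sketch of the measurement, not the code used for the experiments.

```python
import math


def mod3_counts(blocks):
    """Count how many (L, R) blocks fall into each of the nine mod 3 partitions."""
    counts = [0] * 9
    for L, R in blocks:
        counts[3 * (L % 3) + (R % 3)] += 1
    return counts


def chi_square_uniform(counts):
    """Pearson chi-square statistic of observed counts against the uniform law."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 for c in counts) / expected


def chi_square_tail_8dof(x):
    """P[chi^2 >= x] for 8 degrees of freedom: exp(-x/2) * sum_{i<4} (x/2)^i / i!."""
    h = x / 2.0
    return math.exp(-h) * sum(h ** i / math.factorial(i) for i in range(4))
```

Under this formula a score of 37 has a tail probability between $2^{-17}$ and $2^{-16}$, and a score of 62 has a tail probability between $2^{-33}$ and $2^{-32}$, which gives a feel for the size of the two baseline thresholds.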

We used this technique to estimate the number of texts needed to distinguish
$R$ rounds of RC5P, for $1 \le R \le 8$. For each choice of $R$, we ran 50 trials of
the previous test and computed the average number of texts needed as well as a
90% confidence interval. Our measurements are presented in Figure 1; note that
the y axis is scaled logarithmically.



3.2 Mounting the Attack 

Overview of the Attack Here, we discuss a chosen-plaintext chi-square attack 
on RC5P without pre- or post-whitening. The attack can clearly be applied with 
the whitening, or with only known plaintexts; in each case, we require more texts 
and more processing. In the final version of this paper, we will specify attacks 
on more versions of the cipher, as well as having more complete experimental 
data. 

We make use of the mod 3 values of the two 32-bit halves of the RC5P block
to select one of nine partitions into which to put the block. Given a sequence of
N RC5P blocks, we can count how many fall into each of the nine partitions, and
use this count to compute a chi-square score, as discussed above. Informally, the
chi-square score allows us to distinguish between a uniformly random selection of
these partitions, as we would expect from the output of a random permutation,
and a biased selection of these partitions, as we would expect from a cipher with
a good mod 3 approximation available.

Fig. 1. Number of known texts needed to distinguish R rounds of RC5P from random.

The attack works as follows: 

1. Request the encryptions of N chosen plaintexts, where N is chosen according
to the criteria given below.

2. For each resulting ciphertext, try all 48 possible combinations of mod 3 value 
and high-order four bits for the last half-round’s subkey. Use this guess to 
predict the mod 3 values of both 32-bit halves of the block before the last 
half round, and keep count of these values for each guess. 

3. Use these counts to calculate a chi-square score with eight degrees of freedom, 
based on splitting the block into nine possible categories based on the two 
mod 3 values. 

4. Select the partial guess with the highest chi-square score as the most likely 
guess. 

5. Assuming the above guess is correct, begin the process again, this time
guessing the next six bits of subkey. Continue this process, guessing six bits of
subkey at a time, until the last four bits of subkey are remaining. Guess
those four bits and test the guesses in the same way.

6. The result is a guess of the last half-round's subkey which is likely to be
correct. Using this value, we can peel the last half-round off all the
ciphertexts, and use the resulting values to mount the attack again on a
version of the cipher with one fewer half-round.



Choosing the Plaintexts We choose the plaintexts to try to maximize the bias
in the selection of a partition after N rounds. In practice, this means attempting
to bypass most of the effects of the first full round of RC5P. We thus choose
$P_L, P_R$ such that:

1. The high-order eight bits of $P_L$ and $P_R$ are all zeros.

2. The low-order five bits of $P_L$ and $P_R$ are all zeros.

3. $P_L \bmod 3 = P_R \bmod 3 = 0$.

To understand why this makes sense, we must consider the first round of
RC5P without the whitening in some detail. Recall that the operations are:

1. $L := L + R$
2. $L := L \lll R$
3. $L := L + sk_0$
4. $R := R + L$
5. $R := R \lll L$
6. $R := R + sk_1$



When the high-order eight bits of both $P_L$ and $P_R$ are zeros, the high-order
seven bits of L after step one must be zeros, and there is no chance of a carry
in that addition. Recall that when there is not a carry in mod $2^{32}$ addition, the
mod 3 values of the addends can be added together mod 3 to get the correct
mod 3 value for the sum. Because the low five bits of R are zeros, there is no
rotation in step two. Thus, in step three, we know that the high-order seven bits
of L are zeros before the addition. Unless the high-order seven bits of $sk_0$ are
all ones, there will never be a carry in that addition, either. Thus, L will, for
127/128 possible keys, have the same mod 3 value for all inputs chosen as we
have described.

The high-order seven bits of L after the third step are no longer known, but 
will be closely related for all the texts. There will be only two possible values 
for these high seven bits, depending on whether there was a carry into them in 
step three’s addition. 

The high-order eight bits of R going into the addition in the fourth step are
zeros; thus only if all eight high-order bits of L are ones will there ever be a
carry in this addition. For nearly all values of $sk_0$, it will thus not be possible
for there to be a carry in this addition, either.

The low-order five bits of L after the addition in step three will be constant.
Thus, the rotation in step five will be by a constant amount. Rotation by a
constant amount of a constant mod 3 value will yield a constant mod 3 value.
Thus the mod 3 value of R will be constant for nearly all keys after step five. In
step six, there is an addition with $sk_1$. After this step, it will be possible for R
to have one of two mod 3 values. Depending on the high-order few bits of $sk_1$,
R may be nearly balanced between these two values, or may be strongly biased
towards one or the other of them.

The result of the way we choose the plaintexts is thus, for nearly all keys, to 
bypass nearly the first full round of RC5P for the sake of our mod 3 approxima- 
tion. Instead of the possibility of being in all nine different partitions after the 
first round, for most keys the texts can only be in one of two partitions after the 
first round. 
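The effect of this plaintext choice can be checked empirically. The sketch below generates plaintext halves satisfying the three conditions above, runs the six steps of the first round, and collects the set of $(L \bmod 3, R \bmod 3)$ partitions reached. All names are illustrative, and the two subkeys used in the usage note are arbitrary example values, assumed only to avoid the rare case where the high-order seven bits of $sk_0$ are all ones.

```python
import random

MASK = 0xFFFFFFFF


def rotl(x, r):
    """Rotate the 32-bit word x left by (r mod 32) bits."""
    r &= 31
    return ((x << r) | (x >> (32 - r))) & MASK


def chosen_half(rng):
    """Random word with high 8 bits zero, low 5 bits zero, and value = 0 mod 3."""
    while True:
        w = rng.getrandbits(19) << 5  # only bits 5..23 are free
        if w % 3 == 0:
            return w


def first_round(L, R, sk0, sk1):
    """The six steps of the first RC5P round, without whitening."""
    L = (L + R) & MASK
    L = rotl(L, R)
    L = (L + sk0) & MASK
    R = (R + L) & MASK
    R = rotl(R, L)
    R = (R + sk1) & MASK
    return L, R


def partitions_after_first_round(sk0, sk1, trials=2000, seed=1):
    """Set of (L mod 3, R mod 3) values reached from chosen plaintexts."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(trials):
        L, R = first_round(chosen_half(rng), chosen_half(rng), sk0, sk1)
        seen.add((L % 3, R % 3))
    return seen
```

For example, `partitions_after_first_round(0x12345678, 0x9ABCDEF0)` should contain at most two of the nine partitions, all sharing the same value of $L \bmod 3$, in line with the argument above.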



Making the Initial Guess Having collected N ciphertexts, corresponding to
the N chosen plaintexts, we now make an initial guess of the high-order four
bits and mod 3 value of the final half-round's subkey, which we will refer to as
$sk_f$. There are 48 possible values for this guess; we try each guess against all N
ciphertexts.

Each guess suggests a value of the right half of the block before the final
half-round was applied. Consider the decryption of the final half-round:

1. $R := R - sk_f$

2. $R := R \ggg L$

3. $R := R - L$

In the first step, we must determine, based on the known value of R and
the guessed parts of $sk_f$, what the result of the subtraction will be. This is
dependent upon the mod 3 value of $sk_f$, and also its high-order four bits. In
the second step, we must use the known rotation amount from L to determine
what the resulting R value will be. The resulting mod 3 value of R is determined
only by the low-order bit of L. However, the total rotation amount determines
which bits of R from the first step will end up in the high-order bits. Finally,
in the third step, we must use the known L value, along with what is known
about the R value input, to determine the result of this final subtraction. For
most ciphertexts, we know only the mod 3 value of R going into this operation.
However, for some rotation amounts in step two, we also know the likely values
for the high-order few bits of R going into the third step.

We can model this by noting that there are many different subkey words
$sk_{guess}$ which share the same high-order four bits and mod 3 value as a given
guess to be tested. We can choose a reasonable representative from the set of
these, and get a fair approximation to its behavior. We thus derive $sk_{model}$ by
setting its high-order four bits to the value required by the guess, setting the
rest of the bits to alternate between zero and one bits, and then incrementing
the result as necessary to get the guessed mod 3 value. We then carry out a trial
decryption with this partial key guess, get a resulting pair of mod 3 values, and
keep count of how many of each value results from the trial decryption.
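The construction of the representative subkey can be sketched as follows. The function name `sk_model` and the particular alternating pattern (`0101...` in the low 28 bits) are assumptions made for illustration; the text does not say which phase of the alternating pattern is used.

```python
def sk_model(high4, mod3):
    """Representative 32-bit subkey for a guess (high-order four bits, value mod 3)."""
    if not (0 <= high4 < 16 and 0 <= mod3 < 3):
        raise ValueError("guess out of range")
    # High four bits from the guess; remaining 28 bits alternate 0101...01.
    word = (high4 << 28) | 0x05555555
    # Increment until the residue mod 3 matches; at most two steps are needed,
    # so the high-order four bits are never disturbed.
    while word % 3 != mod3:
        word += 1
    return word


# The 48 initial guesses: 16 values of the high four bits times 3 residues.
ALL_GUESSES = [(h, m) for h in range(16) for m in range(3)]
```

Each of the 48 guesses then yields one model subkey, which is used for the trial decryption of every ciphertext.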

To distinguish between the right and wrong partial values for $sk_f$, we use
those counts to compute chi-square scores, and choose the highest chi-square
score as the most likely value. For sufficiently large values of N, we have seen
experimentally that this is very likely to select the right partial value for $sk_f$.

Table 1. Estimates of Difficulty of Attacking RC5P With Many Rounds.

Weak Key Rounds   Average Key Rounds   Texts       Work
11                8                    $2^{29}$    $5 \times 2^{35}$
13                10                   $2^{37}$    $5 \times 2^{43}$
15                12                   $2^{45}$    $5 \times 2^{51}$
17                14                   $2^{53}$    $5 \times 2^{59}$
19                16                   $2^{61}$    $5 \times 2^{67}$



Continuing the Guess Assuming the previous guess was correct, we can
extend this guess. We guess the next four or six bits of $sk_f$, updating $sk_{model}$
appropriately. The new $sk_{model}$ is very little improved at determining whether
there will be a borrow in the subtraction of the first step, but for some rotation
amounts in the second step, it has a strong impact on determining whether there
will be a borrow in the subtraction in the third step. For each guess, we again
keep count of the mod 3 values of the two halves after it is applied, and we again
select the guess with the largest chi-square value.



Continuing the Attack After the full $sk_f$ value is known, we peel off the last
half-round, and apply the attack anew to the resulting cipher.



3.3 Resources Required for the Full Attack

According to our preliminary experiments, the full attack has an acceptably high 
probability of success using N texts, where N is the number of texts necessary 
to get a chi-square value high enough that in practice, it simply could not have 
occurred by chance. 

Our experimental data suggest that each additional round requires roughly 
sixteen times as many texts to get about the same score on average, and that 
there are especially vulnerable keys for which we can expect to get sufficiently 
high scores even with an extra two to three rounds, with the same N. 

The work required for the attack is approximately $5 \times 2^6 \times N$. Table 1
shows our predictions for the approximate workfactor and number of texts needed
to break RC5P.
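The work estimate can be reproduced from the guessing schedule of Section 3.2: 48 initial guesses, then six bits at a time until four bits remain, then those last four bits. The sketch below counts these guesses and extrapolates the number of texts using the factor of sixteen per additional round. The baseline of $2^{29}$ texts for 8 rounds with an average key is an assumption taken from our reading of the first row of Table 1, and the function names are illustrative.

```python
def guesses_per_half_round():
    """Partial-subkey guesses tried per text to recover one 32-bit subkey."""
    initial = 16 * 3         # high four bits times three residues mod 3
    middle = 4 * 2**6        # four steps of six bits each (24 bits)
    last = 2**4              # the remaining four bits
    return initial + middle + last


def estimate(rounds, base_rounds=8, base_log2_texts=29):
    """Return (log2 of texts, work) for an average key, assuming x16 per round."""
    log2_texts = base_log2_texts + 4 * (rounds - base_rounds)
    work = guesses_per_half_round() * 2**log2_texts
    return log2_texts, work
```

The guess count comes to $48 + 256 + 16 = 320 = 5 \times 2^6$ per text, which matches the constant in the work estimate above.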



3.4 Results and Implications 

There are several practical implications of our results: 

1. RC5P, the RC5 variant with XORs changed to mod $2^{32}$ additions, is much
less secure than was previously believed. We suggest a minimum of
22 rounds for reasonable security when used with a 128-bit key.

2. Other ciphers which use rotations and additions, but no multiplications or
XORs, are likely to also be vulnerable. In particular, all elements other than
additions and rotations in such ciphers should be carefully reviewed to see
whether they have good mod 3 approximations. In some cases modular
multiplication may also be vulnerable, depending upon the modulus: for
instance, $f(x) = a \cdot x \bmod (2^{32} + 5)$ is vulnerable to mod 3 cryptanalysis,
since $2^{32} + 5 = 3^3 \times 47 \times 3384529$.

3. Multiplication mod $2^{32}$ as done in RC6, and XORing as done in both RC6 and
RC5, are both very difficult to approximate mod 3. Hence, these operations
generally make ciphers resistant to the attack. However, the specific cipher
designs need to be reviewed to verify that they are used in a way that actually
helps defeat the attack. RC5 and RC6 seem to be resistant to this attack,
as does Mars. But see our analysis of M6 in Section 4 for an example of a
cipher that uses rotations, additions, and XORs yet still succumbs to mod n
attacks.

4. Placing the multiplication or XORing only at the beginning and end of the 
cipher is probably not effective in making the cipher resistant to mod 3 
cryptanalysis, since there are often clever analytical tricks to bypass these 
operations. 

Mixing operations from different algebraic groups was the guiding principle
behind several ciphers (IDEA [LMM91], Twofish [SKW+98], etc.), and that still
seems like a good idea.

3.5 RC6 

Note that our mod 3 attack suggests a new design principle for RC6-like ciphers.
If the $f$ function $x \mapsto x \times (2x + 1) \bmod 2^{32}$ of RC6 had instead been
defined as $x \mapsto x \times (2x + 1) \bmod m$ for some other value of $m$, powerful
mod n attacks might be possible.

For instance, if 3 divides $m$, then we have a probability-1 approximation for
the $f$ function. In this case, the RC6 variant obtained by replacing the XORs
with additions (and also using the modulus $m$ instead of $2^{32}$ in the definition
of $f$) could be broken by mod 3 cryptanalysis. For example, with the IDEA-like
modulus $m = 2^{32} - 1$, RC6 would be in serious trouble if we replace the XORs
by additions.

This suggests the design principle that $\gcd(m, 2^{32} - 1) = 1$. (The mysterious
number $2^{32} - 1$ comes from the formulation of rotations as multiplication by
powers of two modulo $2^{32} - 1$; see Section 2.1.)
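The probability-1 approximation is easy to test numerically. In the sketch below, `f` is the RC6-style quadratic function with a configurable modulus, and `mod3_approximation_holds` (a hypothetical helper) checks on random inputs whether $f(x) \bmod 3$ is fully determined by $x \bmod 3$. This only illustrates the design principle; it is not an attack on any RC6 variant.

```python
import random
from math import gcd


def f(x, m):
    """RC6-style quadratic function x -> x * (2x + 1), reduced modulo m."""
    return (x * (2 * x + 1)) % m


def mod3_approximation_holds(m, trials=10000, seed=0):
    """True if f(x) mod 3 equals f(x mod 3) mod 3 for every sampled x."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randrange(m)
        if f(x, m) % 3 != f(x % 3, 3):
            return False
    return True
```

With $m = 2^{32} - 1$, which shares the factor 3 with $2^{32} - 1$, the check succeeds for every input, whereas with the real RC6 modulus $m = 2^{32}$ (for which `gcd(m, 2**32 - 1)` is 1) it fails almost immediately.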

4 M6 

M6 is a family of ciphers proposed for use in the IEEE1394 FireWire standard,
a peripherals bus for personal computers [FW1, FW2].² M6 is used for
encrypting copyrighted and other protected content between the computer and
the peripheral.

2 M6 is based on work done in [THN91]. Note that no full description of M6 is publicly
available, due to export considerations [Kaz99]. However, a general description of the

For convenience, we briefly describe an example of a M6 cipher here
(specifically, the example given in [FW2], an earlier draft of the standard). See
also Figure 2 for a pictorial illustration.




Fig. 2. One round of a M6 block cipher 



The cipher uses a 10-round Feistel structure. The $f$ function is defined by

$g_1(x) = x \oplus K_1$
$g_2(y) = (y \lll 2) + y + 1 \bmod 2^{32}$
$g_3(z) = (z \lll 8) + z \bmod 2^{32}$
$g_4(a) = a + K_2 \bmod 2^{32}$
$g_5(b) = (b \lll 14) + b \bmod 2^{32}$
$f(x) = (g_5 \circ g_4 \circ g_3 \circ g_2 \circ g_1)(x).$

The round function $F$ updates a 64-bit block $(x, y)$ according to

$F((x, y)) = (y + f(x) \bmod 2^{32}, x).$

M6 typically uses 40 bit keys (although the algorithm also allows for keys up
to 64 bits long), and the key schedule is very simple. Let $K_1$ be the high 32 bits
of the key, and $W$ be the lower 32 bits of the key (so that $K_1$ and $W$ share 24
bits in common). Set $K_2 = K_1 + W \bmod 2^{32}$. Then $K_1, K_2$ are the output of
the key schedule.
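The example cipher is small enough to write out in full. The following Python sketch implements the $f$ function, the key schedule for a 40-bit key, and 10-round encryption and decryption exactly as described above. The function names and the representation of the key as a single 40-bit integer are assumptions made for illustration, not part of any M6 specification.

```python
MASK = 0xFFFFFFFF


def rotl(x, r):
    """Rotate the 32-bit word x left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK


def f(x, K1, K2):
    """The example M6 Feistel function f = g5 . g4 . g3 . g2 . g1."""
    y = x ^ K1                         # g1
    z = (rotl(y, 2) + y + 1) & MASK    # g2
    a = (rotl(z, 8) + z) & MASK        # g3
    b = (a + K2) & MASK                # g4
    return (rotl(b, 14) + b) & MASK    # g5


def key_schedule(key40):
    """Derive (K1, K2) from a 40-bit key given as an integer."""
    K1 = (key40 >> 8) & MASK   # high 32 bits
    W = key40 & MASK           # low 32 bits (overlaps K1 in 24 bits)
    return K1, (K1 + W) & MASK


def encrypt(x, y, K1, K2, rounds=10):
    """Apply the round function F((x, y)) = (y + f(x), x) repeatedly."""
    for _ in range(rounds):
        x, y = (y + f(x, K1, K2)) & MASK, x
    return x, y


def decrypt(x, y, K1, K2, rounds=10):
    """Invert encrypt by undoing one round at a time."""
    for _ in range(rounds):
        x, y = y, (x - f(y, K1, K2)) & MASK
    return x, y
```

Because the left half is updated by adding an output of $f$ in every second round, ten rounds add five $f$ outputs to each half, which is what the mod n analysis below exploits.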

The standard also suggests that other variations on the basic construction 
above can be created by changing the order of the g functions, by swapping 
additions for XORS (or vice versa), and/or by changing the rotation amounts. 
Each round may use a different variation of the basic scheme. As a result, we 
get a family of ciphers, which we will call the M6 ciphers. 

For concreteness, we focus on the example cipher given above. However, we
note that the same techniques also apply to many other ciphers in the M6 family:
as long as the last $g$ function is of the form
$b \mapsto (b \lll \alpha) + b + \beta \bmod 2^{32}$ and the output of the $f$ function is
combined with the block by addition, we will be able to apply mod n techniques.
Thus, a large fraction of the M6 ciphers can be broken by mod n attacks.

family of ciphers from which M6 is drawn can be found in [Hit97]. Our techniques
are applicable to most ciphers in this family.

4.1 A mod 5 Attack 

We note that $f$ is highly non-surjective, and in fact admits excellent mod 5
approximations. In particular, we have the following theorem.

Theorem 1. $f(x) \bmod 5 \in \{0, 4\}$ for all $x$.

Proof: It suffices to show that $g_5(b) \bmod 5 \in \{0, 4\}$. Note that

$g_5(b) = (b \lll 14) + b - 2^{32} k = (2^{14} + 1) b - 2^{32} k \bmod (2^{32} - 1),$

using the relation $b \lll 14 = 2^{14} b \bmod (2^{32} - 1)$ from Section 2.1. It is not
hard to see that $k \in \{0, 1\}$ (just observe that $(b \lll 14) + b < 2^{33}$).

Note that 5 divides $2^{32} - 1$, so we may reduce both sides of the relation
modulo 5. Since $2^{14} + 1 = 0 \bmod 5$ and $2^{32} = 1 \bmod 5$, we get

$g_5(b) = -k \bmod 5, \quad k \in \{0, 1\}.$

This proves our desired result. □

We next analyze the round function $F$ using Theorem 1. The Feistel function
is combined via addition mod $2^{32}$, which makes things easy. Let
$(y', x) = F((x, y))$, so that $y' - y = f(x) \bmod 2^{32}$. Rewriting the latter
equation to eliminate the "mod" gives:

$y' - y = f(x) + 2^{32} k, \quad k \in \{-1, 0\}.$

Reducing both sides modulo five, we get:

$y' - y \bmod 5 \in \{0, 3, 4\}.$

It is not hard to see that $f(x) \bmod 5$ is uniformly distributed on $\{0, 4\}$ and
$k$ is uniformly distributed on $\{-1, 0\}$, so we see that
$y' - y \bmod 5 = f(x) + k \bmod 5$ takes on the values 0, 3, 4 with probabilities
1/4, 1/4, 1/2.

This can now be applied to the whole cipher. Let $P_L$ be the left half of
the plaintext and $C_L$ the left half of the ciphertext. We see that $C_L - P_L$ is
the sum of five independent random variables whose value modulo five has the
distribution (1/4, 0, 0, 1/4, 1/2). Thus the distribution of $C_L - P_L \bmod 5$ is the
five-fold convolution of (1/4, 0, 0, 1/4, 1/2), or approximately:

$p_{C_L - P_L \bmod 5} = (0.248, 0.215, 0.161, 0.161, 0.215).$

The same holds for the right halves.
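The five-fold convolution can be carried out exactly with rational arithmetic. The sketch below (with illustrative function names) convolves the per-round distribution with itself modulo 5 and also evaluates the bias, taken here to be the squared Euclidean distance from the uniform distribution, as in Appendix A.

```python
from fractions import Fraction


def convolve_mod(p, q, n):
    """Distribution of (A + B) mod n for independent A ~ p and B ~ q."""
    out = [Fraction(0)] * n
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[(a + b) % n] += pa * qb
    return out


def n_fold(p, times):
    """Convolve the distribution p with itself `times` times, modulo len(p)."""
    n = len(p)
    result = [Fraction(1)] + [Fraction(0)] * (n - 1)  # point mass at 0
    for _ in range(times):
        result = convolve_mod(result, p, n)
    return result


def bias(p):
    """Squared Euclidean distance between p and the uniform distribution."""
    k = len(p)
    return sum((x - Fraction(1, k)) ** 2 for x in p)


per_round = [Fraction(1, 4), Fraction(0), Fraction(0), Fraction(1, 4), Fraction(1, 2)]
dist = n_fold(per_round, 5)
print([round(float(x), 3) for x in dist], float(bias(dist)))
```

The exact result is (254, 220, 165, 165, 220)/1024, which rounds to the figures quoted above and has a bias of about 0.00577.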

So a significant amount of information is leaked from the plaintext through
to the ciphertext. For instance, $C_L \bmod 5$ has a nearly 1/4 chance of being
equal to $P_L \bmod 5$, which is significantly greater than the 1/5 chance one
would expect from a strong cipher. The bias of the difference is about 0.00577,
which indicates that with a hundred or so known texts we could easily
distinguish the cipher from random.

4.2 Attacks mod 257 and 1285

In fact, the cipher is even worse than the analysis above might indicate. The
Feistel $f$ function also admits excellent mod 257 approximations. These
approximations are easy to use in a key-recovery attack, because they disclose
the value of $K_2 \bmod 257$.

When we combine the mod 257 and mod 5 attacks, we get an attack mod
$1285 = 5 \times 257$ which is even better than either one alone. (This observation is
due to Lars Knudsen [Knu99].)

Theorem 2. $f(x) - 965 \cdot K_2 \bmod 1285 \in \{0, 319, 320, 639, 640, 1284\}$ for all $x$.

Proof: Using a similar argument to that found in the proof of Theorem 1, we
find that $g_3(z) = 257 z - i \bmod (2^{32} - 1)$, where $i \in \{0, 1\}$ represents a carry
bit. Also, we have

$f(x) = g_5(g_3(z) + K_2) = (2^{14} + 1)(257 z - i + K_2 - j) - k \bmod (2^{32} - 1)$

where $j, k \in \{0, 1\}$ are the carry bits in the computation of $g_4$ and $g_5$.
Reducing both sides modulo $1285 = 5 \times 257$, we find

$f(x) = 965 (K_2 - i - j) - k \bmod 1285, \quad i, j, k \in \{0, 1\}.$

Here we have used the identities $2^{32} - 1 = 0 \bmod 1285$, $2^{14} + 1 = 0 \bmod 5$,
and $2^{14} + 1 = 965 \bmod 1285$. The theorem follows. □
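The modular identities used in the proof, and the resulting set of six residues, can be checked with a few lines of arithmetic. The sketch below (function name illustrative) enumerates $-965(i + j) - k \bmod 1285$ over all carry bits $i, j, k \in \{0, 1\}$.

```python
def theorem2_residues(n=1285):
    """Possible values of f(x) - 965*K2 mod n, over all carry bits i, j, k."""
    c = (2**14 + 1) % n   # the multiplier contributed by g5
    return sorted({(-c * (i + j) - k) % n
                   for i in (0, 1) for j in (0, 1) for k in (0, 1)})


# Identities used when reducing modulo 1285 = 5 * 257.
identities_hold = (
    (2**32 - 1) % 1285 == 0
    and (2**14 + 1) % 5 == 0
    and (2**14 + 1) % 1285 == 965
    and (965 * 257) % 1285 == 0   # why the 257*z term vanishes
)
```

Note that the $257z$ term disappears because 965 is a multiple of 5, so $965 \cdot 257$ is a multiple of 1285.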

Moreover, our simulations show that the distribution of
$f(x) - 965 \cdot K_2 \bmod 1285$ is highly non-uniform. Repeating the analysis given
for our mod 5 attack, we find that the distribution of
$X_L = C_L - P_L - 5 \cdot 965 \cdot K_2 \bmod 1285$ has bias 0.0556, so a few dozen known
texts should be enough to distinguish the cipher from random using a mod 1285
attack.

In fact, the distribution of Xl has only 59 non-zero entries (with most of the 
probability density concentrated on only a fraction of them), so given one known 
text we can immediately eliminate all but 59 possibilities for K 2 mod 1285 just 
by looking at the left half of the plaintext and ciphertext. The expected number 
of guesses needed to find K 2 mod 1285 from Xl is easily calculated to be about 
9. Then we can recover the remainder of the key with a search over the 2^*^ 
possibilities for K given the guess at K 2 mod 1285. In other words, given one 
known text we can find the key with expected 2^^ offline trial encryptions; this 
is already 16 times faster than exhaustive search. 

If more known texts are available, we can do even better. Each text gives us 
one observation at Xl- If we also look at the right halves of each text, we can 
double the number of available observations. With several dozen known texts, 
we expect to be able to recover K 2 mod 1285 with relatively good accuracy. 
Therefore, when a few dozen known texts are available, we can find the 40-bit 
key with expected 2^® offline trial encryptions. This demonstrates that the level 
of security afforded by the example M6 cipher is extremely low, even for a 40-bit
cipher.

Powerful ciphertext-only attacks may also be mounted if we use just the
mod 257 attack on its own. Divide the left half of the plaintext into bytes as
$P_L = (w, x, y, z)$. We find that $P_L = z - y + x - w \bmod 257$, so if the bytes of
the plaintext are biased, $P_L \bmod 257$ will be too. When the plaintext is
ASCII-encoded text, we expect very significant biases to remain after
encryption, and in this case the value of $K_2 \bmod 257$ will leak after a
sufficient number of ciphertexts. (For instance, when the plaintext is composed
only of the letters 'a'-'z', we have 81 < ($P_L \bmod 257$) < 181.)

M6 could be easily strengthened against mod n attacks with a small change: 
simply always use XOR instead of addition when combining the output of the 
/ function with the block. (It is worth pointing out that no mere change in 
rotation amounts can secure M6 against mod n attacks.) With such a defense, 
some attacks might still be possible given a very large pool of known texts, but 
since the cipher was only designed for a 40-bit keylength, the results might be 
good enough for practical purposes. 

4.3 MX 

We also note that our analysis techniques can be applied to MX,
another cipher with an internal structure similar to that of the example M6
cipher described above. The primary difference is that MX allows for secret
round-dependent rotation and addition constants inside the round function. For
instance, MX's version of $g_5$ is defined as
$g_5(x) = (x \lll S_3) + x - \gamma \bmod 2^{32}$, where $S_3, \gamma$ are fixed round
constants (not dependent on the secret key). It is not hard to see that $g_5$ has
a good mod n approximation no matter the value of $S_3, \gamma$, and therefore the
MX round function will always be sufficiently biased to allow for excellent
mod n attacks.

5 Generalizations, Conclusions, and Open Questions 

In this paper, we have discussed a new cryptanalytic attack that is extremely 
powerful against ciphers based only upon addition and rotation. We have also 
demonstrated an apparent weakness in RC5P, and given a successful attack 
against a substantial fraction of the M6 family of ciphers. 

This shows that the strength of RC5 relies heavily on the mixture of XORs and
additions: differential cryptanalysis breaks the variant with only XORs [BK98],
and we have shown that mod 3 cryptanalysis is a very powerful attack against
the variant with only additions. We conclude that the mixing of additions and
XORs in RC5 is not just a nice touch: it is absolutely essential to the security of
the cipher.

Note that we can also consider mod p attacks, for any prime $p$ dividing
$2^{32} - 1$. The prime factorization of $2^{32} - 1$ is
$3 \times 5 \times 17 \times 257 \times 65537$, so there is no shortage of potential
candidates for $p$.

One of the potential problems with using values of $p > 3$ is that a rotation
can now involve a multiplication by any value in the set
$S_p = \{2^j \bmod p : j = 0, 1, \ldots\}$. Fortunately (for the attacker), when
$p = 2^k + 1$ divides $2^{32} - 1$ we have the nice property that $|S_p| = 2k$ (since
$2^k = -1 \bmod p$ and $2^{2k} = 1 \bmod p$), so generalized mod p attacks might be
even more successful against RC5P than our mod 3 attack was.
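The size of the multiplier set is simple to confirm for each prime factor of $2^{32} - 1$. The sketch below (function name illustrative) enumerates the powers of two modulo $p$ until they repeat.

```python
def rotation_multipliers(p):
    """The set S_p = {2^j mod p : j = 0, 1, ...} for an odd modulus p."""
    seen, x = set(), 1
    while x not in seen:
        seen.add(x)
        x = (2 * x) % p
    return seen


# Each prime factor of 2^32 - 1 has the form 2^k + 1.
sizes = {2**k + 1: len(rotation_multipliers(2**k + 1)) for k in (1, 2, 4, 8, 16)}
```

For $p = 3, 5, 17, 257, 65537$ the set has $2, 4, 8, 16, 32$ elements respectively, so even for $p = 65537$ a rotation can only multiply the residue by one of 32 values.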

Of course, we can also consider mod n attacks where n is not necessarily
prime. When n is composite, a mod n attack is the rough equivalent of mod
p cryptanalysis with multiple approximations. If the prime factorization of n is
$p_1 \times \cdots \times p_m$, then by the Chinese remainder theorem all mod n attacks may
be decomposed into m attacks modulo each $p_j$.

A number of open questions remain, which we hope to see investigated in 
the near future. Among them: 

1. Other moduli than 3, 5, and 257 might have some advantages in this kind 
of attack. Section El contains some early work in this direction, but more is 
needed. 

2. Other ciphers that might be vulnerable to mod p attacks. As a rule, ciphers 
that use only addition and rotation are likely to be vulnerable, as are ciphers 
that use addition and some nonlinear operations (such as S-boxes) which 
have a good mod p approximation. 

3. We have made a number of observations in this paper based on experi- 
ments; we would like to find improved mathematical models to explain these 
observations, and solidify our predictions about events we can’t verify expe- 
rimentally. 

4. We suspect that a variant of differential cryptanalysis can be defined for
some ciphers, using differences mod n instead of mod $2^{32}$. Much of the same
mathematical apparatus can be used for this class of attack as for our attack.

5. Data-dependent rotations are handled poorly by the standard analytic
techniques available to us (namely, linear attacks and differential attacks with
XOR or mod $2^{32}$-based differences). We are interested in other properties
that can be used in differential- or linear-like attacks, but which will survive
rotation (and particularly data-dependent rotation) better than fixed XOR
differences.

A The $\chi^2$ Test

In this section we study how to distinguish a source with distribution $p_X$ from
a source with the uniform distribution $p_U$. The optimal algorithm is the $\chi^2$
test, and we briefly recall its definition, as well as several standard results, here.

The $\chi^2$ test allows one to test the hypothesis that the source has distribution
$p_U$. Suppose we have $n$ (independent) observations, and let $n_i$ denote the
number of times the source took on the value $i$. Treating each $n_i$ as a random
variable (subject to the constraint that $\sum_i n_i = n$), the $\chi^2$ statistic is
defined as:

$\chi^2(n_1, \ldots, n_k) = \sum_i \frac{(n_i - E_U n_i)^2}{E_U n_i}.$

Here $E_U n_i$ denotes the expected value of $n_i$ under the assumption that the
source has distribution $p_U$. It is not hard to see that $E_U n_i = n/k$, so the
$\chi^2$ statistic is just $(k/n) \sum_i (n_i - n/k)^2$. In the $\chi^2$ test, we compare the
observed $\chi^2$ statistic to $\chi^2_{a,k-1}$ (the threshold for the $\chi^2$ test with
$k - 1$ degrees of freedom and with significance level $a$).

We can easily compute the expected value of the $\chi^2$ statistic.

Theorem 3. $E_X \chi^2(n_1, \ldots, n_k) = nk \|p_X - p_U\|^2 + k - k \|p_X\|^2$.

Corollary 1. $E_U \chi^2 = k - 1$.

We can see that if $n = c / \|p_X - p_U\|^2$, then
$E_X \chi^2 = ck + k - k \|p_X\|^2$. Since we will usually be interested in the case
where $p_X \approx p_U$, we find $E_X \chi^2 \approx (c + 1) k - 1$. Thus $E_X \chi^2$ differs
from $E_U \chi^2$ by a significant amount when $c = \Omega(1)$.
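Theorem 3 can be verified exactly on small cases by brute force. The sketch below enumerates every sequence of $n$ observations from a distribution on $k$ values, computes the exact expectation of the $\chi^2$ statistic with rational arithmetic, and compares it with the closed form. The function names and the small example distribution used in the usage note are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def expected_chi_square(p, n):
    """Exact E_X[chi^2] for n i.i.d. observations drawn from distribution p."""
    k = len(p)
    total = Fraction(0)
    for seq in product(range(k), repeat=n):
        prob = Fraction(1)
        counts = [0] * k
        for i in seq:
            prob *= p[i]
            counts[i] += 1
        chi2 = Fraction(k, n) * sum((c - Fraction(n, k)) ** 2 for c in counts)
        total += prob * chi2
    return total


def theorem3(p, n):
    """Closed form n*k*||p - u||^2 + k - k*||p||^2 from Theorem 3."""
    k = len(p)
    dist = sum((x - Fraction(1, k)) ** 2 for x in p)
    norm = sum(x * x for x in p)
    return n * k * dist + k - k * norm
```

For instance, with $p_X = (1/2, 1/3, 1/6)$ and $n = 4$ the two functions agree exactly, and for the uniform distribution both give $k - 1$, as in Corollary 1.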

In summary, we can conclude that $n = O(1/\|p_X - p_U\|^2)$ observations suffice
to distinguish a source with distribution $p_X$ from a source with distribution
$p_U$. This shows that our definition of the bias of $p_X$ as $\|p_X - p_U\|^2$ was
well-chosen.

Abstract. This paper describes a new differential-style attack, which
we call the boomerang attack. This attack has several interesting
applications. First, we disprove the oft-repeated claim that eliminating
all high-probability differentials for the whole cipher is sufficient to
guarantee security against differential attacks. Second, we show how to
break COCONUT98, a cipher designed using decorrelation techniques to
ensure provable security against differential attacks, with an advanced
differential-style attack that needs just $2^{16}$ adaptively chosen texts. Also,
to illustrate the power of boomerang techniques, we give new attacks on
Khufu-16, FEAL-6, and 16 rounds of CAST-256.



1 Introduction 

One of the most powerful cryptanalytic techniques known in the open literature
is differential cryptanalysis [BS93]. Differential analysis has been used to break
many published ciphers. It is understandable, then, that block cipher designers
are typically quite anxious to ensure security against differential style attacks.

The usual design procedure goes something like this. The algorithm designer
obtains somehow an upper bound $p$ on the probability of any differential
characteristic for the cipher. Then the designer invokes an oft-repeated "folk
theorem" to justify that any successful differential attack will require at least
$1/p$ texts to break the cipher, which is supposed to allow us to conclude that the
cipher is safe from differential attacks.

Unfortunately, this folk theorem is wrong. We exhibit an attack, which we
call the boomerang attack, that can allow an adversary to beat the $1/p$ bound in
some cases.¹ In particular, if the best characteristic for half of the rounds of the
cipher has probability $q$, then the boomerang attack can be used in a successful
attack needing $O(q^{-4})$ chosen texts. In some cases, we may have
$q^{-4} \ll p^{-1}$, in which case the boomerang attack allows one to beat the folk
theorem's bound. Also, boomerang attacks sometimes allow for a more extensive
use of structures than is available in conventional differential attacks, which
makes boomerang techniques more effective than the preceding discussion might
suggest.

1 Note that Biham et al.'s impossible differentials [BBS98, BBS99] also disprove the
folk theorem. They show that if one can find a differential of sufficiently low
probability, the cipher can be broken. However, the boomerang attack in fact lets us
make a sharper statement: even if no differential for the whole cipher has
probability that is too high or too low, the cipher might still be vulnerable to
differential-style attacks.

Cipher      (Rounds)   Our Attack
                       Data Complexity   Time Complexity
COCONUT98   (8)        $2^{16}$ CP
Khufu       (16)       $2^{18}$ CP
CAST-256    (16)       $2^{49.3}$ KP       $2^{49.3}$
FEAL        (6)        4 CP

KP: known-plaintext, CP: adaptive chosen-plaintext/ciphertext.
Table 1. Summary of our attacks.



We give a surprisingly sharp example of this possibility below, where we show
how to break COCONUT98 [V98] with just $2^{16}$ chosen texts and $2^{38}$ work,
despite a proof that the best characteristic for the whole cipher must have
probability $p \approx 2^{-64}$. Our attack makes crucial use of a characteristic for
half of the cipher with probability $q \approx 2^{-4}$. This shows that the folk theorem
can fail spectacularly, even for real-world ciphers.

We also extend the boomerang attack to use techniques from truncated
differential analysis. As a result, we are able to analyze ciphers which admit
good truncated differentials. We show how to break 16 rounds of Khufu with
$2^{18}$ adaptive chosen plaintexts and ciphertexts and very little work. We also
consider CAST-256, where we show how to break 16 rounds with $2^{49.3}$ known
texts,² and briefly sketch the inside-out attack, a dual to the boomerang attack.
Finally, we discuss some related work and conclude the paper. See Table 1 for
our table of results.

2 The Boomerang Attack: A Generic View 

The boomerang attack is a differential attack that attempts to generate a quartet
structure at an intermediate value halfway through the cipher.

The attack considers four plaintexts $P, P', Q, Q'$, along with their respective
ciphertexts $C, C', D, D'$; we will defer describing how these are generated until
later. Let $E(\cdot)$ represent the encryption operation, and decompose the cipher into
$E = E_1 \circ E_0$, where $E_0$ represents the first half of the cipher and $E_1$ represents
the last half. We will use a differential characteristic, call it $\Delta \to \Delta^*$, for $E_0$, as
well as a characteristic $\nabla \to \nabla^*$ for $E_1^{-1}$.

We want to cover the pair $P, P'$ with the characteristic for $E_0$, and to cover
the pairs $P, Q$ and $P', Q'$ with the characteristic for $E_1^{-1}$. Then (we claim) the
pair $Q, Q'$ is perfectly set up to use the characteristic $\Delta^* \to \Delta$ for $E_0^{-1}$.

Footnote (on the CAST-256 attack): See also Appendix B, where we show that CAST-256 would be much weaker if
the round ordering were reversed: in particular, boomerang attacks would be able
to break 24 rounds of this variant with $2^{48.5}$ chosen texts. Please note that this
24-round boomerang attack does not apply to the real CAST-256 AES proposal.

Let's examine why this is so. Consider the intermediate value after half of
the rounds. When the previous three characteristics hold, we have

$$E_0(Q) \oplus E_0(Q') = E_0(P) \oplus E_0(P') \oplus E_0(P) \oplus E_0(Q) \oplus E_0(P') \oplus E_0(Q')$$
$$= E_0(P) \oplus E_0(P') \oplus E_1^{-1}(C) \oplus E_1^{-1}(D) \oplus E_1^{-1}(C') \oplus E_1^{-1}(D')$$
$$= \Delta^* \oplus \nabla^* \oplus \nabla^* = \Delta^*.$$

Note that this is exactly the condition required to start the characteristic $\Delta^* \to \Delta$
for the inverse of the first half of the cipher. When this characteristic also
holds, we will have the same difference in the plaintexts $Q, Q'$ as found in the
original plaintexts $P, P'$. This is why we call it the boomerang attack: when you
send it properly, it always comes back to you.



Fig. 1. A schematic of the basic boomerang attack.



We define a right quartet as one where all four characteristics hold simultaneously.
The only remaining issue is how to choose the texts so they have the
right differences. We suggest generating $P' = P \oplus \Delta$, and getting the encryptions
$C, C'$ of $P, P'$ with two chosen-plaintext queries. Then we generate $D, D'$ as
$D = C \oplus \nabla$ and $D' = C' \oplus \nabla$. Finally we decrypt $D, D'$ to obtain the plaintexts
$Q, Q'$ with two adaptive chosen-ciphertext queries. See Figure 1 for a pictorial
depiction of the basic boomerang attack.

In the remainder of the paper, we consider several concrete attacks using the
boomerang attack.
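
As a sanity check on the quartet identity of this section, the following is a minimal sketch under stated assumptions: the two cipher halves are toy 8-bit random permutations (`e0`, `e1` are hypothetical stand-ins, not any real cipher), and the function name and parameters are ours. It builds quartets exactly as described (two encryptions, two decryptions) and confirms that whenever the two ciphertext pairs share the same difference under $E_1^{-1}$, the pair $Q, Q'$ reproduces the intermediate difference of $P, P'$.

```python
import random

def boomerang_identity_holds(n_bits=8, seed=1, delta=0x01):
    """Check the quartet identity on a toy cipher E = E1 o E0."""
    rng = random.Random(seed)
    size = 1 << n_bits
    # Hypothetical random permutations stand in for the two cipher halves.
    e0 = list(range(size))
    e1 = list(range(size))
    rng.shuffle(e0)
    rng.shuffle(e1)
    e0_inv = [0] * size
    e1_inv = [0] * size
    for x in range(size):
        e0_inv[e0[x]] = x
        e1_inv[e1[x]] = x

    def encrypt(p):
        return e1[e0[p]]

    def decrypt(c):
        return e0_inv[e1_inv[c]]

    quartets_checked = 0
    for nabla in range(1, size):
        for p in range(size):
            p2 = p ^ delta
            c, c2 = encrypt(p), encrypt(p2)       # two chosen-plaintext queries
            d, d2 = c ^ nabla, c2 ^ nabla
            q, q2 = decrypt(d), decrypt(d2)       # two chosen-ciphertext queries
            # Differences of the two ciphertext pairs after undoing E1.
            mid_cd = e1_inv[c] ^ e1_inv[d]
            mid_c2d2 = e1_inv[c2] ^ e1_inv[d2]
            if mid_cd == mid_c2d2:
                quartets_checked += 1
                # The boomerang "comes back": same middle difference for Q, Q'.
                if e0[q] ^ e0[q2] != e0[p] ^ e0[p2]:
                    return False, quartets_checked
    return True, quartets_checked
```

The check is purely algebraic, so it succeeds for any choice of permutations; what a real attack adds is a choice of $\Delta, \nabla$ for which the conditions hold with high probability.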



3 The COCONUT98 Algorithm 

The COCONUT98 cipher [V98] may be of special interest to some readers because
of its reliance on the recently-developed theory of decorrelation techniques
for block cipher design [V97, V98, V98b, GG+98]. Using decorrelation techniques,
[V98] proves that the full COCONUT98 cipher admits no good differential characteristics.
Despite this fact, we observe that there are differential characteristics
of very high probability for half of the cipher, and we make extensive use of
these characteristics in our attack. This suggests that the decorrelation design
technique may fail to provide security against advanced differential attacks in
some cases if extra care is not taken. This is not to suggest that the decorrelation
approach is fundamentally flawed (indeed, decorrelation theory seems like
a very useful tool for the cipher designer) but rather that the theoretical results
must be interpreted with caution.

We briefly recount the description of the COCONUT98 algorithm. COCONUT98
uses a 256-bit key $K = (K_1, \ldots, K_8)$. The key schedule generates eight
round subkeys $k_1, \ldots, k_8$ as

  $i$      1        2                   3                              4
  $k_i$    $K_1$    $K_1 \oplus K_3$    $K_1 \oplus K_3 \oplus K_4$    $K_1 \oplus K_4$

  $i$      5        6                   7                              8
  $k_i$    $K_2$    $K_2 \oplus K_3$    $K_2 \oplus K_3 \oplus K_4$    $K_2 \oplus K_4$

The last four key words are used to build a decorrelation module

$$M(xy) = (xy \oplus K_5K_6) \times K_7K_8 \bmod GF(2^{64})$$

where concatenation of symbols (e.g. $xy$) represents the concatenation of their
values as bitstrings.

Next, we build a Feistel network as follows. Let

$$\phi(x) = x + 256 \cdot S(x \bmod 256) \bmod 2^{32}$$
$$F_i((x, y)) = (y,\ x \oplus \phi(\mathrm{ROL}_{11}(\phi(y \oplus k_i)) + c \bmod 2^{32}))$$
$$\Psi_i = F_{4i+4} \circ F_{4i+3} \circ F_{4i+2} \circ F_{4i+1}$$

where $\mathrm{ROL}_{11}(\cdot)$ represents a left rotation by 11 bits, $c$ is a public 32-bit constant,
and $S : \mathbb{Z}_2^8 \to \mathbb{Z}_2^{24}$ is a fixed S-box.

With this notation, COCONUT98 is defined as $\Psi_1 \circ M \circ \Psi_0$. In other words,
COCONUT98 consists of four Feistel rounds with subkeys $k_1, \ldots, k_4$, followed
by an evaluation of the decorrelation module $M$, and finally four more Feistel
rounds with subkeys $k_5, \ldots, k_8$.
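
The structure just described can be sketched in a few lines. This is an illustration under loudly stated assumptions, not a reference implementation: the S-box `SBOX`, the constant `C_CONST`, and the field modulus `POLY` ($x^{64} + x^4 + x^3 + x + 1$, a standard irreducible polynomial) are placeholders of ours, since none of these values is given in this paper. Only the key schedule, the Feistel round, and the placement of $M$ follow the description above.

```python
import random

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
POLY = 0x1B            # assumption: x^64 + x^4 + x^3 + x + 1 as GF(2^64) modulus
C_CONST = 0x12345678   # placeholder for the public constant c
_rng = random.Random(98)
SBOX = [_rng.getrandbits(24) for _ in range(256)]  # placeholder 8-to-24-bit S-box

def rol(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32

def phi(x):
    return (x + 256 * SBOX[x & 0xFF]) & MASK32

def f(y, k):
    """phi(ROL11(phi(y ^ k)) + c mod 2^32), the inner part of F_i."""
    return phi((rol(phi(y ^ k), 11) + C_CONST) & MASK32)

def key_schedule(K):
    """K is a list of eight 32-bit words K_1..K_8 (0-indexed here)."""
    K1, K2, K3, K4 = K[0], K[1], K[2], K[3]
    return [K1, K1 ^ K3, K1 ^ K3 ^ K4, K1 ^ K4,
            K2, K2 ^ K3, K2 ^ K3 ^ K4, K2 ^ K4]

def gf_mul(a, b):
    """Multiply in GF(2^64) modulo the assumed polynomial."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> 64:
            a = (a & MASK64) ^ POLY
    return r

def gf_inv(a):
    """Inverse as a^(2^64 - 2); requires a != 0 and an irreducible modulus."""
    result, base, e = 1, a, (1 << 64) - 2
    while e:
        if e & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        e >>= 1
    return result

def feistel(x, y, subkeys):
    for k in subkeys:
        x, y = y, x ^ f(y, k)
    return x, y

def feistel_inv(x, y, subkeys):
    for k in reversed(subkeys):
        x, y = y ^ f(x, k), x
    return x, y

def encrypt(block, K):
    ks = key_schedule(K)
    x, y = feistel(block >> 32, block & MASK32, ks[:4])           # Psi_0
    m = gf_mul(((x << 32) | y) ^ ((K[4] << 32) | K[5]),
               (K[6] << 32) | K[7])                               # M
    x, y = feistel(m >> 32, m & MASK32, ks[4:])                   # Psi_1
    return (x << 32) | y

def decrypt(block, K):
    ks = key_schedule(K)
    x, y = feistel_inv(block >> 32, block & MASK32, ks[4:])
    m = gf_mul((x << 32) | y, gf_inv((K[6] << 32) | K[7])) ^ ((K[4] << 32) | K[5])
    x, y = feistel_inv(m >> 32, m & MASK32, ks[:4])
    return (x << 32) | y
```

Note that `decrypt` needs $K_7K_8 \ne 0$, since otherwise $M$ is not invertible.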
4 Differential Characteristics for COCONUT98

This section discusses the differential characteristics of COCONUT98. In the
following discussion, let $e_j = 2^j$ be the 32-bit XOR difference with just the $j$-th
bit flipped. (Subscripts are taken modulo 32, for convenience in modeling the
$\mathrm{ROL}(\cdot, 11)$ operation.)

We note that the Feistel rounds of COCONUT98 admit very good differential
characteristics. The main observation is that $e_j \to e_{j+11}$ by the Feistel function
with probability 1/2 when $j \in J = \{8, 9, \ldots, 19, 20, 29, 30, 31\}$. Similarly,
$e_j \oplus e_k \to e_{j+11} \oplus e_{k+11}$ with probability 1/4 when $j, k \in J$ ($j \ne k$).

Using this idea, we can build many good characteristics for four rounds of
COCONUT98. For example, the characteristic

$$(e_{19}, e_{18} \oplus e_8) \to (e_{18} \oplus e_8, e_{29}) \to (e_{29}, e_{18}) \to (e_{18}, 0) \to (0, e_{18})$$

for $\Psi$ has probability $0.83 \cdot 2^{-4} \approx 2^{-4.3}$. Of course, by symmetry we also get
corresponding backwards characteristics for decryption through four Feistel rounds.
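
The one-round observation can be checked empirically. The sketch below is a Monte Carlo estimate under assumptions of ours: the S-box and the constant $c$ are random placeholders (the real COCONUT98 tables are not reproduced in this paper), so the numbers are only indicative. It estimates how often an input difference $e_j$ to the Feistel function yields exactly $e_{j+11}$.

```python
import random

MASK32 = 0xFFFFFFFF
rng = random.Random(4)
SBOX = [rng.getrandbits(24) for _ in range(256)]   # placeholder S-box (assumption)
C_CONST = rng.getrandbits(32)                       # placeholder constant c (assumption)

def rol(x, r):
    return ((x << r) | (x >> (32 - r))) & MASK32

def phi(x):
    return (x + 256 * SBOX[x & 0xFF]) & MASK32

def f(y, k):
    """phi(ROL11(phi(y ^ k)) + c mod 2^32)."""
    return phi((rol(phi(y ^ k), 11) + C_CONST) & MASK32)

def estimate(j, trials=20000):
    """Estimate Pr[e_j -> e_{j+11}] over random inputs and subkeys."""
    e_in = 1 << j
    e_out = 1 << ((j + 11) % 32)
    hits = 0
    for _ in range(trials):
        y = rng.getrandbits(32)
        k = rng.getrandbits(32)
        if f(y, k) ^ f(y ^ e_in, k) == e_out:
            hits += 1
    return hits / trials
```

With these placeholders one would expect `estimate(10)` to land near 1/2, while a bit outside $J$ such as `estimate(24)` should be close to zero, because the rotated difference then falls into the S-box input of the second $\phi$.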

This suggests that we ought to try to find some way to take advantage of
these high-probability characteristics for the half-cipher in our analysis. However,
the task is not as easy as it might first look. If we try to mount a traditional
differential attack on the whole cipher, the decorrelation module $M$ will immediately
cause serious difficulties. When the key words used by $M$ are unknown, it is
very difficult to push any differential characteristic through $M$. More precisely,
every differential $\delta \to \delta^*$ for $M$ with $\delta, \delta^* \ne 0$ has average probability $1/(2^{64} - 1)$,
where the probability is averaged over all possible key values. In short, the decorrelation
module prevents us from pushing a differential characteristic past $M$.

This is where the boomerang attack comes in handy: the boomerang quartet
property allows us to control the effect of the decorrelation module in the middle.

The crucial idea which lets the attack work is that $M$ is affine, and thus for
any fixed key there are excellent characteristics $\nabla^* \to M^{-1}(\nabla^*)$ of probability 1
for $M^{-1}$. Take $E_0 = \Psi_0$ and $E_1 = \Psi_1 \circ M$. Then if $\nabla \to \nabla^*$ is a good characteristic
for $\Psi_1^{-1}$, we will obtain a good characteristic $\nabla \to M^{-1}(\nabla^*)$ for $E_1^{-1}$. It does not
matter that $M^{-1}(\nabla^*)$ is unknown to the attacker; the crucial property is that it
depends only on the key (and not on the values of the ciphertexts).
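
To make the affine argument concrete, here is a short worked derivation. The abbreviations $a = K_5K_6$ and $b = K_7K_8$ are ours, and we assume $b \ne 0$ so that $M$ is invertible; all arithmetic is in $GF(2^{64})$.

```latex
% Notation (ours): a = K_5K_6, b = K_7K_8, with b \neq 0.
\begin{align*}
M(u) &= (u \oplus a) \times b, \qquad M^{-1}(v) = (v \times b^{-1}) \oplus a, \\
M^{-1}(v) \oplus M^{-1}(v \oplus \nabla^*)
  &= (v \times b^{-1}) \oplus a \oplus \bigl((v \oplus \nabla^*) \times b^{-1}\bigr) \oplus a \\
  &= \nabla^* \times b^{-1}.
\end{align*}
```

The output difference is independent of $v$ and of $a$, which is why the same (unknown but fixed) difference appears in both ciphertext pairs of a quartet.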

Let us estimate the success probability for this technique. We need two characteristics
for $E_0$, and two for $E_1^{-1}$, to hold. Thus, a simple estimate of the
probability $p$ of success is

$$p \ge \Pr[\Delta \to \Delta^* \text{ by } \Psi_0]^2 \cdot \Pr[\nabla \to \nabla^* \text{ by } \Psi_1^{-1}]^2$$

where $\Delta, \Delta^*, \nabla, \nabla^*$ may be chosen arbitrarily by the attacker to maximize $p$.

Footnote (on the probability 1/2 of $e_j \to e_{j+11}$): At first glance, it might appear that the probability is 1/8, because there are three
additions in the F function and thus three carry bits to control. However, the three
carries are not independent, and in fact we can handle three carries as easily as
one by noting that $x \mapsto (x + a \bmod 2^{32}) + b \bmod 2^{32}$ (two carries) is equivalent to
$x \mapsto x + c \bmod 2^{32}$ (one carry) where $c = a + b$.

The rotate does not destroy this property, so long as we avoid the most significant
bits, which explains our choice of $J$. Empirically, the probabilities are 0.47, 0.44, 0.38
for $j = 18, 19, 20$ and 0.47, 0.44 for $j = 29, 30$. For other values of $j$, the probability
is very close to 1/2.

It turns out that this estimate can be refined a bit. We note that the same
attack works even if we do not predict the exact value of $\nabla^*$ ahead of time, but
instead merely require that the difference after decrypting by $\Psi_1^{-1}$ is the same in
the two pairs $P, Q$ and $P', Q'$. A similar observation also holds for $\Delta^*$. Therefore,
we may sum over all values for $\Delta^*, \nabla^*$, to obtain

$$p \approx \sum_{\Delta^*} \Pr[\Delta \to \Delta^* \text{ by } \Psi_0]^2 \cdot \sum_{\nabla^*} \Pr[\nabla \to \nabla^* \text{ by } \Psi_1^{-1}]^2.$$

For COCONUT98, this can be used to significantly increase the probability of
attack. Empirically, we find that $\Delta = \nabla = (e_{10}, e_{31})$ provides $p \approx 0.023 \cdot 0.023 \approx 1/1900$.

5 The Basic Boomerang Attack on COCONUT98 

Next we show how to use the quartet property established above to mount a
practical attack on COCONUT98. We use a 1-R attack, so the criterion for
success is that $Q \oplus Q' = (?, e_{31})$ where ? represents an arbitrary word. This
improves the success probability $p$ by a factor of two, to 1/950.

It is immediately clear from this discussion that COCONUT98 can be easily
distinguished from an ideal cipher with at most about $950 \cdot 4 = 3800$ adaptive
chosen plaintext/ciphertext queries. However, we aim for more: a key-recovery
attack.

The key-recovery attack proceeds along relatively standard lines. In about
$16 \cdot 950$ trials requiring $16 \cdot 950 \cdot 4$ adaptive chosen plaintext/ciphertext queries,
we generate about 16 useful quartets. Note that the signal-to-noise ratio is extremely
high, so we should be able to filter out all wrong quartets very effectively.

First, we recover $K_1$. We guess $K_1$, and peel off the first round. We use
the fact that if $P, P', Q, Q'$ form a quartet with the property above, then the
XOR difference after one round of encryption must be $(e_{31}, 0)$ for both the $P, P'$
pair and the $Q, Q'$ pair. This condition holds for 1/2 of the wrong key values.
Therefore each quartet gives one bit of information on $K_1$ from the $P, P'$ pair
and another bit of information from the $Q, Q'$ pair. With 16 useful quartets, we
expect $K_1$ to be identified uniquely.

Next, we recover $K_2 \oplus K_4$ by decrypting up one round and examining the XOR
difference in the $C, D$ pair and in the $C', D'$ pair. The details are very similar to
those used to learn $K_1$.

This allows us to peel off the first and last rounds of the cipher. Then we
repeat the attack on the reduced cipher. For instance, we can use about $8 \cdot 144 \cdot 4$
more adaptive chosen plaintext/ciphertext queries to generate about 8 useful
quartets for the reduced cipher if we use the same settings for $\Delta, \nabla$, since then
the success probability $p$ increases to about 1/144. Using these 8 useful quartets
for the reduced cipher we learn $K_3$; and we repeat the attack iteratively until
the entire key is known.

In all, the complexity of the attack is about $16 \cdot 950 \cdot 4 + 8 \cdot 144 \cdot 4 + \ldots \approx 2^{16}$
adaptive chosen plaintext/ciphertext queries. The attack requires $8 \cdot 2 \cdot 32 \cdot 2^{32} =
2^{41}$ offline computations of the F function, which is work comparable to that
required for $2^{38}$ trial encryptions. The attack can also be converted to a known-plaintext
attack, but then the complexity increases dramatically to $2^{52}$ texts.
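
The arithmetic behind these totals is easy to reproduce. The sketch below simply evaluates the quoted products; the function name is ours, and the conversion from F-function computations to trial encryptions assumes that one trial encryption costs eight F evaluations (one per Feistel round), which is our reading rather than something stated explicitly.

```python
import math

def coconut98_attack_costs():
    """Evaluate the query and work estimates quoted for the COCONUT98 attack."""
    first_stage = 16 * 950 * 4       # 16 useful quartets at p ~ 1/950, 4 queries each
    second_stage = 8 * 144 * 4       # 8 useful quartets at p ~ 1/144, 4 queries each
    f_evals = 8 * 2 * 32 * 2**32     # offline computations of the F function
    rounds_per_encryption = 8        # assumption: 8 F evaluations per encryption
    return {
        "first_stage_queries": first_stage,
        "second_stage_queries": second_stage,
        "log2_first_two_stages": math.log2(first_stage + second_stage),
        "log2_f_evals": math.log2(f_evals),
        "log2_trial_encryptions": math.log2(f_evals / rounds_per_encryption),
    }
```

The first two stages already account for almost exactly $2^{16}$ queries, so the later stages of the iteration contribute only a small correction.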

The best conventional attack on COCONUT98 we could find was a meet-in-the-middle
attack that exploits a weakness in the key schedule. However, the
meet-in-the-middle attack requires approximately $2^{96}$ trial encryptions, so our
chosen-text boomerang attack compares very favorably to it. See Appendix A
for more details on the meet-in-the-middle attack.

Fixing the cipher would require careful changes to its internal design. One
possible approach would be to replace the four-round Feistel network by a
transformation with much more strength against differential cryptanalysis (say,
16 rounds instead of 4). Another possible approach is to use a decorrelation
module in each round; this seems likely to prevent boomerang-style attacks, and
is in fact the approach proposed in the DFC AES submission [GG+98]. (Using
just a decorrelation module before the first round and after the last round is not
enough: differential-style attacks are still possible.)

It is clear that the mere use of decorrelation techniques is not enough to
guarantee security against differential-style attacks. At the same time, although
it does not provide the conjectured $2^{64}$ security level, COCONUT98's decorrelation
module does seem to improve the cipher's security. Without a decorrelation
module, COCONUT98 would be vulnerable to conventional differential attacks
requiring on the order of $2^8$ chosen texts, so in this case the decorrelation module
seems to have approximately squared the security level of the base cipher.

6 Extensions to Truncated Differential Analysis

So far we have confined the discussion to conventional differential characteristics, 
but it seems natural to wonder whether boomerang attacks can also be made 
to work using truncated differentials. The answer is yes, but there are some 
difficulties. 

The pitfall with extensions to truncated differentials is that

$$\Pr[\Delta \to \Delta^* \text{ by } F] = \Pr[\Delta^* \to \Delta \text{ by } F^{-1}]$$

always holds for conventional differential characteristics, but can fail to hold
for truncated characteristics. Note that our analysis in earlier sections assumed
that if $\Delta \to \Delta^*$ by the first half of the cipher, then $\Delta^* \to \Delta$ holds with the
same probability for the inverse of the first half of the cipher. For truncated
differentials, this assumption in general is not correct.

A more accurate formula for the success probability $p$ of a boomerang attack
with truncated differentials is

$$p \approx \sum_{w \oplus x \oplus y \oplus z = 0} \Pr[\Delta \to w \text{ by } E_0] \times \Pr[\nabla \to x \text{ by } E_1^{-1}] \times \Pr[\nabla \to y \text{ by } E_1^{-1}] \times \Pr[z \to \Delta \text{ by } E_0^{-1}].$$

This formula is rather unwieldy, but fortunately it can often be simplified substantially to

$$p \approx \Pr[\Delta \to \Delta^*] \times \Pr[\nabla \to \nabla^*]^2 \times \Pr[\Delta^* \to \Delta] \times \Pr[w \oplus x \oplus y \in \Delta^* \mid w \in \Delta^*,\ x, y \in \nabla^*].$$

If the truncated differentials $\Delta^*, \nabla^*$ are linear (i.e. closed under $\oplus$), as is usually
the case, the last term in the formula above is easily computed.
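
For small linear sets the last term can be computed by brute force, which makes the claim easy to experiment with. The following sketch is illustrative only: the helper names `span` and `last_term` and the toy 4-bit difference sets are ours, and the texts are assumed to be uniformly distributed over the sets.

```python
from fractions import Fraction
from itertools import product

def span(basis):
    """All XOR combinations of the basis vectors, i.e. a linear difference set."""
    elems = {0}
    for b in basis:
        elems |= {e ^ b for e in elems}
    return elems

def last_term(delta_star, nabla_star):
    """Pr[w ^ x ^ y in delta_star | w in delta_star, x, y in nabla_star]."""
    good = sum(1 for w, x, y in product(delta_star, nabla_star, nabla_star)
               if (w ^ x ^ y) in delta_star)
    total = len(delta_star) * len(nabla_star) ** 2
    return Fraction(good, total)
```

For example, `last_term(span([1, 2]), span([2, 4]))` evaluates to 1/2. When both sets are closed under XOR, $x \oplus y$ is uniform over $\nabla^*$, so the term reduces to $|\Delta^* \cap \nabla^*| / |\nabla^*|$.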

7 Khufu 

We describe a boomerang attack that breaks 16 rounds of Khufu [Mer90] with
$2^{18}$ adaptively chosen plaintext/ciphertext queries and a comparable workfactor.
This is an improvement over the best previous result, a differential attack
on Khufu-16 needing chosen texts (depending on whether one wants a 

distinguishing or key-recovery attack) IHTMI . 

In our boomerang attack, we exploit that there are excellent truncated differentials
available for both halves of the cipher. For the first half of the cipher,
we use

$$\Delta = (0, 0, 0, a, b, c, d, e) \to (0, 0, 0, a, 0, 0, 0, 0) = \Delta^*,$$

which holds with probability $2^{-32}$ in the forward direction and probability
1 in the reverse direction. We will hold $a$ fixed throughout the attack. For
the inverse of the last half of the cipher, we use $\nabla = (0, 0, 0, a, 0, 0, 0, 0) \to
(0, 0, 0, a, f, g, h, i) = \nabla^*$, which holds with probability 1. Also, due to a careful
choice of $\nabla^*, \Delta^*$, we have $\Pr[w \oplus x \oplus y \in \Delta^* \mid w \in \Delta^*,\ x, y \in \nabla^*] = 1$. Thus $2^{-32}$
of the quartets chosen according to these differences will form right quartets.

One can use structures to reduce the number of texts needed. Choose a pool
of $2^{16}$ plaintexts $(L, R_i)$ with $L$ held fixed and $R_i$ varying. Also, form another
pool of $2^{16}$ plaintexts as $(L', R'_j)$ where $L' = L \oplus (0, 0, 0, a)$ and $R'_j$ varies. For
each ciphertext $C$ obtained by encrypting one of these $2^{17}$ plaintexts, we decrypt
$D = C \oplus \nabla$ to get the plaintext $Q$. We look for $Q, Q'$ with a difference of $(0, 0, 0, a)$
in the left half of the block; such a pair probably indicates a right quartet. This
choice of structures is expected to provide about one right quartet, although one
wrong quartet will probably also survive the initial filtering phase.
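
The counting behind these expectations can be written out as follows. This is a sketch under our reading of the structure sizes (two pools of $2^{16}$ plaintexts each) and under the assumption that a wrong quartet passes the 32-bit condition on the left half of $Q \oplus Q'$ with probability about $2^{-32}$.

```latex
\begin{align*}
\text{plaintexts encrypted} &= 2^{16} + 2^{16} = 2^{17}, \\
\text{total queries (encryptions plus decryptions)} &= 2 \cdot 2^{17} = 2^{18}, \\
\text{candidate quartets } (P, P') &= 2^{16} \cdot 2^{16} = 2^{32}, \\
E[\text{right quartets}] &\approx 2^{32} \cdot 2^{-32} = 1, \\
E[\text{wrong quartets passing the filter}] &\approx 2^{32} \cdot 2^{-32} = 1.
\end{align*}
```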

Once we have a (suggested) right quartet formed by $(L, R_i)$ and $(L', R'_j)$,
we can use it to obtain more right quartets at little cost. We form another
$2^{10}$ quartets by choosing $P = (L \oplus (\alpha, \beta, 0, 0), R_i)$, $P' = (L' \oplus (\alpha, \beta, 0, 0), R'_j)$
where $\alpha, \beta$ take on $2^{10}$ possible values; $C, C', D, D', Q, Q'$ are generated from
$P, P'$ as before. Now each such quartet is guaranteed to be a right quartet (if
$(L, R_i), (L', R'_j)$ formed a right quartet) because we have successfully bypassed
the first round. Thus, any wrong quartets which survived the earlier filtering
phase are easily eliminated. Furthermore, given $2^{10}$ right quartets we expect to
be able to form 2^^ equations of the form $S_1(x) \oplus S_1(y) = z$ for known values
of $x, y, z$, and this should be sufficient to recover $S_1$ up to an XOR by a 32-bit
constant. Then the 8-round reduced cipher can be broken trivially.

In total, this attack on Khufu-16 requires $2^{18} + 4 \times 2^{10} \approx 2^{18}$ adaptively
chosen texts. The workfactor is minimal.



8 FEAL 

One can also apply boomerang techniques to FEAL. There are 3-round differential
characteristics with probability one [BS93], so we immediately obtain an
efficient boomerang attack that distinguishes FEAL-6 from a random permutation
with only four adaptive chosen plaintext/ciphertext queries. (This elegant
observation is due to Eli Biham [Bih99].)

9 Inside-Out Attacks 

In this section, we sketch a description of the “inside-out attack,” which may be 
viewed as a dual to the boomerang attack. The difference is that the boomerang 
attack works from the outside in while the inside-out attack works from the 
inside out. 

In the inside-out attack, we search for pairs of texts which contain a desired
difference $\Delta$ at the intermediate value after half the rounds. We hope that the
differential $\Delta \to \Delta'$ for $E_1$ and the differential $\Delta \to \Delta^*$ for $E_0^{-1}$ both hold.
In this case, we will have recognizable differences $\Delta^*$ and $\Delta'$ in the plaintexts
and ciphertexts of the pair. If we accumulate enough pairs with the difference
$\Delta$ halfway through the cipher, we should be able to find at least one right pair
where both differentials hold.

To illustrate these ideas in action, we analyze 16 rounds of CAST-256. CAST-256
[Ada98] is a generalized Feistel block cipher, whose simplicity makes it a nice
test-bed to explore the properties of generalized Feistel round structures.

We briefly recall the definition of CAST-256 here. The 128-bit block is divided
into four 32-bit words, and a Feistel function $F : \mathbb{Z}_2^{32} \to \mathbb{Z}_2^{32}$ is used to update
the block. There are two types of rounds, which we shall call "A rounds" and "B
rounds" in a choice of terminology inspired by Skipjack. An A round encrypts
the input block $(w, x, y, z)$ to $(z, w, x, y \oplus F(z))$, and a B round encrypts to
$(x, y, z \oplus F(w), w)$. Note that $A \approx B^{-1}$; by this we mean that the structure of
the inverse of a B round is the same as the structure of an A round, not that
they are true functional inverses. With this terminology, the CAST-256 cipher
structure is defined as $B^{24} \circ A^{24}$, i.e. 24 A rounds followed by 24 B rounds.

The CAST-256 structure admits many nice truncated differentials. In our
attack, we will use $\Delta = (0, 0, 0, a) \to (0, b, c, a) = \Delta'$, which holds
with probability 1 for 8 B rounds, and $\Delta = (0, 0, 0, a) \to (0, d, e, a) = \Delta^*$, which holds
with probability 1 for decrypting through 8 A rounds.
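
Because these truncated differentials depend only on the round structure, they can be checked with any round function. The sketch below is an illustration under assumptions of ours: `F` is a hash-based placeholder and not the real CAST-256 function, and the helper names are hypothetical. It follows the difference $(0, 0, 0, a)$ through 8 B rounds and through 8 inverse A rounds.

```python
import hashlib
import random

def F(word, round_index):
    """Placeholder 32-bit round function (not the real CAST-256 F)."""
    data = round_index.to_bytes(4, "big") + word.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def a_round(state, r):
    w, x, y, z = state
    return (z, w, x, y ^ F(z, r))

def b_round(state, r):
    w, x, y, z = state
    return (x, y, z ^ F(w, r), w)

def a_round_inv(state, r):
    z, w, x, t = state
    return (w, x, t ^ F(z, r), z)

def xor_state(s, t):
    return tuple(p ^ q for p, q in zip(s, t))

def check_truncated(trials=200, seed=256):
    """Check (0,0,0,a) -> (0,*,*,a) for 8 B rounds and 8 inverse A rounds."""
    rng = random.Random(seed)
    for _ in range(trials):
        s = tuple(rng.getrandbits(32) for _ in range(4))
        a = rng.getrandbits(32) | 1            # force a nonzero difference word
        t = xor_state(s, (0, 0, 0, a))
        fwd_s, fwd_t, bwd_s, bwd_t = s, t, s, t
        for r in range(8):
            fwd_s, fwd_t = b_round(fwd_s, r), b_round(fwd_t, r)
            bwd_s, bwd_t = a_round_inv(bwd_s, r), a_round_inv(bwd_t, r)
        for diff in (xor_state(fwd_s, fwd_t), xor_state(bwd_s, bwd_t)):
            if diff[0] != 0 or diff[3] != a:
                return False
    return True
```

In both directions the first word of the difference is zero and the last word equals $a$, which is the source of the 96-bit filtering condition on a plaintext/ciphertext pair.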

The signal-to-noise ratio of the inside-out attack will be reasonably good, 
because right pairs can be recognized by a 96-bit filtering condition. 

To implement the attack, we collect $2^{49.3}$ known texts encrypted under 16
rounds of CAST-256. By the birthday paradox, we expect to see three right pairs
among those texts, which can be readily recognized. (We also expect to get three
wrong pairs, but they should be eliminated in the next phase.) Then we search 
over the last round subkey. Each guess at the 37 key bits entering the last round 
suggests 2® possible values for the 37 key bits entering the next-to-last round; 
the three right pairs allow us to uniquely recognize the correct values for the last 
two round subkeys. The first two round subkeys can be recovered by analogous 
techniques. Finally, the attack may be repeated on the reduced-round cipher. 
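
The birthday estimate can be made explicit. This is a sketch assuming $N = 2^{49.3}$ known texts, that the intermediate difference of a random pair has the form $(0, 0, 0, a)$ with $a \ne 0$ with probability about $2^{-96}$, and that a wrong pair passes the 96-bit filter with probability $2^{-96}$.

```latex
\begin{align*}
\text{pairs} &\approx \tfrac{1}{2} N^2 = \tfrac{1}{2} \cdot 2^{98.6} = 2^{97.6}, \\
E[\text{right pairs}] &\approx 2^{97.6} \cdot 2^{-96} = 2^{1.6} \approx 3, \\
E[\text{wrong pairs passing the filter}] &\approx 2^{97.6} \cdot 2^{-96} \approx 3.
\end{align*}
```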

To sum up, we see how to break 16 rounds of CAST-256 with an inside-out
attack that needs just $2^{49.3}$ known texts and very little work. This attack is
independent of the definition of the F function or key schedule, and depends only
on the round structure.

There are two implications of our analysis. First, it indicates that CAST-256 
reduced to 16 rounds would not be adequately secure. Since CAST-256 with 48 
rounds is 2-2.5 times slower on high-end CPUs than the fastest AES candidates 
mm, this suggests that CAST-256 ’s security-to-performance ratio may not 
be as high as some other contenders. On the other hand, security clearly must 
take precedence over performance, and here our analysis provides some support 
for the CAST-256 design. We have seen that CAST-256's round ordering is
ideally suited to resist boomerang attacks (see Appendix B), and due to the
sheer number of rounds, it seems very hard to extend our inside-out attack to
the full cipher.



10 Related Work 

The boomerang attack is closely related to many other ideas that have pre- 
viously occurred in the literature. As a result, there are many different ways to 
think about the boomerang attack. In this section, we will try to survey the 
possibilities. 

The boomerang attack is related to the differential-linear attack of [LH94]. In
a differential-linear attack, one covers $E_0$ with a truncated differential $\Delta \to \Delta^*$,
covers $E_1^{-1}$ with a linear approximation $\Gamma \to \Gamma^*$, and finally covers $E_1$ with
a second approximation $\Gamma^* \to \Gamma$; there is also the additional requirement that
$\Gamma^* \cdot x$ be constant for all $x \in \Delta^*$. From this perspective, one could think of the
boomerang attack as a "differential-differential" attack (if the reader will indulge
a slight abuse of terminology).

A similar observation is that the boomerang attack is closely related to
higher-order differential techniques [Lai94, Knu95]. As noted in Section 4, the
pairs $P, Q$ and $P', Q'$ don't actually need to follow $\nabla \to \nabla^*$: it is sufficient that
$E_1^{-1}(C) \oplus E_1^{-1}(C') \oplus E_1^{-1}(D) \oplus E_1^{-1}(D') = 0$ for the corresponding
ciphertexts, and this may be viewed (in a very
approximate sense) as a higher-order differential of order two. In this way, the
boomerang attack can be considered as an intermediate step between conventional
differential and higher-order differential attacks.

Another precursor of the boomerang attack is the "double-swiping" attack
[KSW97], a differential related-key attack on NewDES-1996 that can, in retrospect,
be viewed as a boomerang-style attack (with minor adjustments to take
advantage of related-key queries, as allowed in [KSW97]'s extended threat model).

One of the interesting features of the boomerang attack is that it is apparently
very well-suited to the analysis of ciphers that use asymmetric round functions.
Asymmetric round functions can be classified into one of two types: the A round, 
which has better diffusion in the forward direction than in the reverse direction, 
and the B round, which has better diffusion in the reverse direction. We note 
that when the first half of the cipher is built of B-type rounds and the last half 
is built of A-type rounds, boomerang attacks seem to be especially dangerous 
because they allow one to probe from both endpoints at the same time. 

This supplies some intuition for how the boomerang attack works. It would 
not be unreasonable to think of the boomerang attack as a differential meet- 
in-the-middle attack that uses differentials to work from the outside in; the 
interesting bit is what happens where the differentials “meet” in the middle of 
the cipher. 

One disadvantage of the boomerang attack is that it inherently requires
the ability to perform both adaptive chosen-plaintext and adaptive chosen-ciphertext
queries at once, a rare requirement to find in a practical attack. We
are aware of only two other attacks with this property: (1) the adaptive
chosen-plaintext/ciphertext attack on the 3-round Luby-Rackoff cipher, which is also
used to good effect in some of Knudsen's work [Knu98] on Luby-Rackoff ciphers
with more rounds, and (2) Biham et al.'s yo-yo game [BB+98], which is closely
related to their more famous miss-in-the-middle attack [BBS98, BBS99].

The relation between the boomerang attack and the miss-in-the-middle attack
is a close and interesting one. It seems that the boomerang attack is little more
than a chosen-plaintext/ciphertext version of the miss-in-the-middle attack. In
particular, if $\Pr[\Delta \to \Delta^*] = \Pr[\nabla \to \nabla^*] = 1$ and $\Delta^* \cap \nabla^* = \emptyset$, then the same
pair of differentials can be used to obtain either a miss-in-the-middle attack
(using the impossible differential $\Delta \to \nabla$) or a boomerang attack.

This paper showed that in some special cases the boomerang attack can improve
on the miss-in-the-middle attack, if adaptive chosen plaintext/ciphertext
queries are available. However, this seems to be the exception rather than the
rule. For several ciphers, including Skipjack and CAST-256, miss-in-the-middle
attacks penetrate through more rounds than boomerang attacks [BBS98, BBS99].
Though a thorough comparison of the two types of attacks continues to elude
us, we hope that this work will stimulate further research into the interaction
between these two attacks.



11 Conclusions 

We have described a new way to use differential-style techniques for cryptanalysis
of block ciphers. Our attacks can break some ciphers that are immune to ordinary
differential cryptanalysis, and can provide a powerful new way to analyze ciphers
with asymmetrical round structures. To protect against these attacks, cipher
designers should ensure that there are no good differentials for the first or last
half of their cipher.

Footnote (on boomerang attacks against asymmetric round functions): See Appendix B for a concrete example of this.

A Meet-in-the-Middle Attack on COCONUT98

The very simple key schedule used in COCONUT98 exposes it to meet-in-the-middle
attacks. The problem is that there are only 96 bits of entropy in the first
four round subkeys, and a similar property holds for the last four round subkeys.
Therefore, with just four known texts and about $2^{96}$ offline work, one can break
COCONUT98 using standard meet-in-the-middle techniques. The workfactor
of this attack is disappointingly low for a cipher with a 256-bit key.

When the key is chosen non-uniformly, e.g. from a passphrase, this attack
can be even more deadly. If we assume a key entropy of 4 bits/byte (probably
a gross overestimate for most passphrases), the workfactor of the meet-in-the-middle
attack can be reduced to approximately $2^{48}$ trial encryptions. This is
much faster than exhaustive keysearch.
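
The two workfactors follow from a short entropy count. This is a sketch assuming, per the key schedule in Section 3, that the first four round subkeys depend only on the three 32-bit words $K_1, K_3, K_4$ (and the last four only on $K_2, K_3, K_4$).

```latex
\begin{align*}
\text{uniform key:} \quad & 3 \text{ words} \times 32 \text{ bits} = 96 \text{ bits}
  \;\Rightarrow\; \approx 2^{96} \text{ work}, \\
\text{passphrase at 4 bits/byte:} \quad & 3 \text{ words} \times 4 \text{ bytes} \times 4 \text{ bits} = 48 \text{ bits}
  \;\Rightarrow\; \approx 2^{48} \text{ trial encryptions}.
\end{align*}
```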

Footnote (on the meet-in-the-middle technique): Specifically: Obtain four known text pairs $P_j, C_j$ for $j = 1, 2, 3, 4$. Guess $K_3, K_4$. For
each possibility for $K_1$, store $(\Psi_0(P_3) - \Psi_0(P_1)) / (\Psi_0(P_2) - \Psi_0(P_1)), K_1$ in a lookup
table keyed on the first component. Finally, for each possibility for $K_2$, we compute
$(\Psi_1^{-1}(C_3) - \Psi_1^{-1}(C_1)) / (\Psi_1^{-1}(C_2) - \Psi_1^{-1}(C_1))$ and look for a match in the lookup
table.

B A CAST-256 Variant

In this section, we consider a simple CAST-256 variant obtained by exchanging
the order of the A rounds and the B rounds. (In other words, the variant cipher
uses the B rounds first.) The primary contribution is that such a variant can be
readily analyzed using boomerang attacks.

Please note that this attack does not apply to CAST-256 (only to a variant
with a different round structure). Since the designers of CAST-256 already
knew of the need to apply the A rounds first rMm\ . we feel that the variant 
does injustice to the spirit of the CAST-256 design. We focus on CAST-256 
primarily because it makes such a simple, clean platform for analysis of novel 
round structures. We believe our attack on this CAST-256 variant gives new 
insights into the properties of various ciphers with generalized Feistel round 
structures fNSA98IB( J-l-9iSKll A ;9SIYuv97IIN W97|Saa9iS) . so we hope the analysis 
is of independent interest. 

The sheer number of rounds makes it hard to mount good attacks on the 
full 48-round CAST-256. In this section, we show that boomerang attacks with 
complexity are possible on 24-25 rounds of the variant cipher. These 

attacks do not appear to extend to the original CAST-256 round ordering, so we 
believe this provides some additional justification that CAST-256 is using the 
right round ordering. 

A SIMPLE ATTACK ON 24 ROUNDS. We use the truncated differential $\Delta =
(0, 0, 0, a) \to (b, c, d, a) = \Delta^*$ for the 12 B rounds (where $a$ may take on any non-zero
value, and $b, c, d$ are arbitrary). For the inverse of the last half of the cipher,
we use a similar truncated differential: namely, $\nabla = (0, 0, 0, e) \to (f, g, h, e) = \nabla^*$.

Using the machinery developed in Section 6, the computation of the success
probability is straightforward. Both of these truncated differentials have
probability 1, and $\Delta^* \to \Delta$ has probability $2^{-96}$. Finally, we note that

$$\Pr[w \oplus x \oplus y \in \Delta^* \mid w \in \Delta^*,\ x, y \in \nabla^*] = \Pr[a \oplus e \oplus e' \ne 0] = 1 - 2^{-32},$$

so the overall success probability is $p \approx 2^{-96}$.

We start the attack by choosing $2^{32}$ plaintexts $P_i$ where the first three words
are held fixed and the last takes on all $2^{32}$ possibilities, and we obtain the
corresponding ciphertexts $C_i$. For each such ciphertext $C_i$, we generate $2^{16.5}$
new ciphertexts $D_{i,j}$ by varying the final word. Then we decrypt each $D_{i,j}$ to
obtain the corresponding plaintext $Q_{i,j}$. This gives us $2^{63}$ choices for $P, P'$ from
the pool of plaintexts $P_i$ and another $2^{33}$ choices for $D, D'$ from the $D_{i,j}$. In all,
there will be $2^{96}$ possible quartets to choose from. About $p \approx 2^{-96}$ of them will
form right quartets, so we expect to see one right quartet. The excellent filtering
available (we can filter on all 128 bits of $Q \oplus Q'$) allows us to eliminate all the
wrong quartets with high probability.
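
The quartet count can be checked step by step. This is a sketch assuming a pool of $2^{32}$ plaintexts and $2^{16.5}$ modified ciphertexts per plaintext, with success probability $p \approx 2^{-96}$ per quartet.

```latex
\begin{align*}
\text{choices for } (P, P') &\approx \binom{2^{32}}{2} \approx 2^{63}, \\
\text{choices for } (D, D') \text{ given } (C, C') &= 2^{16.5} \cdot 2^{16.5} = 2^{33}, \\
\text{quartets} &\approx 2^{63} \cdot 2^{33} = 2^{96}, \\
E[\text{right quartets}] &\approx 2^{96} \cdot 2^{-96} = 1, \\
\text{texts} &= 2^{32} + 2^{32} \cdot 2^{16.5} \approx 2^{48.5}.
\end{align*}
```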

This immediately gives a way to distinguish the 24-round CAST-256 variant
from an ideal cipher with $2^{48.5}$ adaptively chosen texts and a low workfactor.

A KEY-RECOVERY ATTACK ON 25 ROUNDS. The same ideas can also be used 
to develop key recovery attacks. For instance, we can break the 25-round variant 
obtained by prepending one more B round at the beginning with 2®® chosen 

texts and a similar amount of work. Due to lack of space, we give only a very
brief sketch of the attack: we bypass the first round with structures, and then in
the analysis phase we guess the first-round subkey, peel off the first round, and
check for the existence of right quartets.

Footnote (on the CAST-256 variant): See also Section 9, which analyzes 16 rounds of the real CAST-256 cipher (without
any re-ordering of the rounds).

Discussion. It is worth comparing our results to what is attainable with 
conventional truncated differential cryptanalysis. In the case of this CAST-256 
variant, boomerang attacks seem to compare favorably for up to 24 rounds, due 
to the asymmetric round structure, but for more than 25 rounds conventional 
techniques are at least as good as the boomerang. 

The astute reader will have noticed that our truncated differentials $\Delta \to
\Delta^*$ (for the 12 B rounds) and $\nabla^* \to \nabla$ (for the 12 A rounds) can be readily
concatenated to obtain a truncated differential $\Delta = (0, 0, 0, a) \to (0, 0, 0, a) = \nabla$
of probability $2^{-96}$ for the entire cipher. The resulting 24-round differential
can be used in a conventional truncated differential
attack that distinguishes the 24-round CAST-256 variant from an ideal cipher
with 2®® (non-adaptive) chosen plaintexts. Note that you can also get a miss- 
in-the-middle attack on the 24-round variant with the same techniques, since 
(0,0,0, a) — >■ (0,0,0, o') is an impossible differential when a yf a'. This gives an 
attack that uses 2®® chosen plaintexts and not much work. 

Thus, our 24-round boomerang attack (2^® ® adaptive chosen-plaintext and 
chosen-ciphertext queries) seems better than the conventional truncated differen- 
tial attack (2®® chosen plaintexts) or the miss-in-the-middle attack (2®® chosen 
plaintexts), but it loses its advantage at 25 rounds. 

One reason why the boomerang attack succeeds against the CAST-256 va- 
riant is that CAST-256 rounds exhibit a definite asymmetry. In both Skipjack 
and CAST-256, the A rounds have weaker diffusion in the reverse direction than 
in the forward direction, while the B rounds are stronger in the reverse direc- 
tion. Thus, the combination of A and B rounds makes conventional differential 
attacks harder than usual: whether we attack the cipher or the inverse cipher, 
we will have to push a differential through 12 “strong rounds”. In contrast, the 
boomerang attack allows us to follow the path of least resistance in both direc- 
tions, because we cover the B rounds with a differential running in the forward 
direction and cover the A rounds with a differential running in the reverse di- 
rection. This makes the boomerang attack especially well-suited to the analysis 
of a cascade of B rounds followed by A rounds. 

By the same line of reasoning, boomerang techniques would be especially 
weak at analyzing the real CAST-256 cipher, where the A rounds precede the B 
rounds. A boomerang attack on CAST-256 would be attacking the cipher at all 
of its strongest points, and thus boomerang techniques would be a particularly 
poor tool for analyzing the real CAST-256 round structure.

Abstract. We provide new constructions for Luby-Rackoff block ciphers
which are efficient in terms of computations and key material used. Next, 
we show that we can make some security guarantees for Luby-Rackoff 
block ciphers under much weaker and more practical assumptions about 
the underlying function; namely, that the underlying function is a se- 
cure Message Authentication Code. Finally, we provide a SHA-1 based 
example block cipher called Sha-zam. 



1 Introduction 

The design of block ciphers whose security provably relies on a hard underlying
primitive has been a popular area of contemporary cryptographic research.
The path-breaking paper of Luby and Rackoff [7] described the construction
of pseudorandom permutation generators from pseudorandom function generators,
which enabled the formalization of the notion of a block cipher. This theoretical
breakthrough has stimulated a lot of research, and ciphers based on this principle
are often termed Luby-Rackoff ciphers. Recall that block ciphers are private-key
cryptosystems with the property that the plaintext and ciphertext blocks have
the same length. Pseudorandom permutations can then be interpreted as
block ciphers that are secure against adaptive chosen plaintext and ciphertext
attacks. These permutations are closely related to the concept of pseudorandom
functions, which was defined by Goldreich, Goldwasser and Micali (GGM)
[5]. These are functions which are “indistinguishable” from random functions in
polynomial time. The GGM construction relied on the notion of pseudorandom
bit generators, i.e., bit generators whose output cannot be distinguished from a
sequence of random bits by any efficient method.

* Work done while this author was at Lucent Technologies 
L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 171-185, 1999. 

© Springer-Verlag Berlin Heidelberg 1999

The core of a Luby-Rackoff cipher is a Feistel network. The use of Feistel
networks for cipher design is not new; in fact, it is one of the central design
principles of DES. The original construction of Luby and Rackoff consists of four
Feistel permutations, each of which requires the evaluation of a pseudorandom
function. The original proofs of security were subsequently simplified
by Maurer [9], who provided a more general treatment based on
information-theoretic (as opposed to complexity-theoretic) ideas. In what follows we
review some of the more popular results in this field.

We start with the problem of producing a 2n-bit pseudorandom permutation
from n-bit pseudorandom functions. Luby and Rackoff used the method of Feistel
networks. Clearly, from an information-theoretic point of view, one would need at
least two rounds of n-bit pseudorandom functions (whose entropy is n bits) since
the entropy of the permutations required is 2n bits. The next question is whether
two rounds are enough. Luby and Rackoff showed that if only two rounds are used
and an attacker chooses two different inputs with the same “right half of n bits”,
then he can easily distinguish the outputs from those of a truly random permutation.
Hence they suggested the use of at least three rounds, which is secure against
chosen plaintext attacks. As it turned out, three rounds are not resistant to
adaptive chosen ciphertext attacks, and in fact they showed that four rounds
are sufficient to guarantee resistance to adaptive chosen plaintext and ciphertext
attacks. The proof involved choosing four different pseudorandom functions in
the four rounds.
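
The two-round weakness described above can be illustrated with a small sketch. It assumes XOR as the group operation, a 16-bit half block, and lazily sampled random functions standing in for the pseudorandom functions; the names `random_function`, `two_round_feistel` and `distinguisher` are hypothetical and chosen only for this example.

```python
import secrets

N_BITS = 16  # half-block size n (illustrative assumption)


def random_function():
    """Lazily sampled random function on n-bit inputs (stand-in for a PRF)."""
    table = {}

    def f(x):
        if x not in table:
            table[x] = secrets.randbits(N_BITS)
        return table[x]

    return f


def two_round_feistel(f1, f2):
    """Two XOR-Feistel rounds, each mapping (L, R) to (R, L ^ f(R))."""

    def encrypt(left, right):
        left, right = right, left ^ f1(right)  # round 1
        left, right = right, left ^ f2(right)  # round 2
        return left, right

    return encrypt


def distinguisher(oracle):
    """Query two plaintexts sharing a right half and test the XOR relation."""
    l1, l2, r = 0x0001, 0x0002, 0x1234
    out1, _ = oracle(l1, r)
    out2, _ = oracle(l2, r)
    # After two rounds the left output is L ^ f1(R), so the XOR of the two
    # left outputs equals the XOR of the two left inputs.
    return (out1 ^ out2) == (l1 ^ l2)
```

For the two-round cipher the relation holds on every run, whereas a truly random permutation on 32 bits would satisfy it only with probability of roughly $2^{-16}$.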

Following this work, and the paper of Maurer [9], Lucks further generalized
the proofs to include unbalanced Feistel networks. The main contribution of his
work is the notion of a difference concentrator, which is a non-cryptographic
primitive that replaces the pseudorandom function in the first round yet
offers the same security. Parallel to this, a lot of research has concentrated on
obtaining variants of Luby-Rackoff constructions where the number of different
pseudorandom functions used in the four rounds is minimized. For example,
see [12], [16]. This minimization is motivated by the fact that pseudorandom
functions are computationally intensive to create and hence any reduction in the
number of different functions used directly leads to a more efficient construction.
Following these works, Naor and Reingold [10] established a very efficient
generalization, where they formalized Lucks’ treatment by using strongly universal
hash functions. In [10], they achieve an improvement in the computational
complexity by using only two applications of pseudorandom functions on n bits to
compute the value of a 2n-bit pseudorandom permutation. The central idea is to
sandwich the two rounds of Feistel networks involving the pseudorandom
functions between two pairwise independent 2n-bit permutations. In other words,
let $f$ be a pseudorandom function on n bits and let $h_1, h_2$ be two pairwise
independent 2n-bit permutations (for example $h_i(x) = a_i x + b_i \bmod p$, where $a_i, b_i$
are uniformly distributed). Then Naor and Reingold proved that $h_1$ followed by
two Feistel rounds using $f$ followed by $h_2$ is a 2n-bit pseudorandom permutation.
They generalized this construction by relaxing the pairwise independence
condition on the exterior permutation rounds but changed the interior rounds
to include two different pseudorandom functions. This piece of work represents
the state of the art in efficiency related to Luby-Rackoff ciphers.

Another line of research has focused on how to enhance the security of Luby- 
Rackoff ciphers. Patarin [12] has shown that a Luby-Rackoff permutation can be 
distinguished from a random permutation using 0(2 ^ ) queries. In a related work, 
[1] showed how to obtain pseudorandom functions on 2n bits from pseudorandom 
functions on n bits using Benes networks. More recently, Patarin [13] has shown 
that six rounds of the Luby-Rackoff construction (instead of four) results in 
a pseudorandom permutation which cannot be distinguished from a random 
permutation with advantage better than 0(p;r); where m is the number of 
queries. 

In this paper we show some new constructions of more practical Luby-Rackoff 
block ciphers from the standpoint of efficiency of computations. In addition, we 
provide security guarantees for Luby-Rackoff ciphers under weaker and more 
practical assumptions about the underlying primitive. We start with the Naor- 
Reingold construction but in a more general context of Abelian groups (as op- 
posed to n bit vectors) and introduce new improvements in efficiency. With the 
same pseudorandom function in rounds 2 and 3, our universal hash functions in 
the 1st and 4th rounds operate only on half the data as opposed to the entire 
data thereby improving on the Naor-Reingold construction. Note that the round 
functions involve the group operation for encryption, and the difference opera- 
tion for decryption as opposed to the usual XOR operation for both encryption 
and decryption. 

We employ a novel construct called a Bi-symmetric $\epsilon$-$\Delta$-universal hash
function. We also give an example of such a Bi-symmetric $\epsilon$-$\Delta$-universal hash
function that can be implemented much more efficiently than standard universal
hash function constructions. Another interesting variation in the proof of security
is that we show that even if the underlying function is only a secure Message
Authentication Code (as opposed to the much stronger pseudorandom function),
no adversary can easily invert Luby-Rackoff block ciphers. Finally, we provide a
SHA-1 based example block cipher.

This paper is organized as follows: Section 2 includes the preliminaries,
followed by the main results in section 3. Here we define the notion of a
Bi-symmetric $\epsilon$-$\Delta$-universal hash function, and provide an example. This concept
is central to our proof of the main theorem. The security of Luby-Rackoff ciphers
using a secure MAC is discussed in section 4. A SHA-1 based example cipher is
included in section 5. In section 6 we discuss issues related to the optimality of
Luby-Rackoff ciphers.

2 Preliminaries 

Let $H$ be a family of functions going from a domain $D$ to a range $G$, where $G$
is an Abelian group with '+' and '−' as the addition and subtraction operators
respectively. Let $\epsilon$ be a constant such that $1/|G| \le \epsilon \le 1$. The probabilities
denoted below are all taken over the choice of $h \in H$.

Definition 1. $H$ is a $\Delta$-universal family of hash functions if for all $x, y \in D$
with $x \ne y$, and all $a \in G$, $\Pr[h(x) - h(y) = a] \le 1/|G|$. $H$ is called
$\epsilon$-almost-$\Delta$-universal if $\Pr[h(x) - h(y) = a] \le \epsilon$. An example is the linear
hash $h(x) = ax \bmod p$ with $a \ne 0$.

Definition 2. $H$ is a strongly universal family of hash functions if for all $x, y \in D$
with $x \ne y$ and all $a, b \in G$, $\Pr[h(x) = a, h(y) = b] \le 1/|G|^2$. $H$ is an
$\epsilon$-almost-strongly-universal family of hash functions if $\Pr[h(x) = a, h(y) = b] \le \epsilon/|G|$.
An example is the linear congruential hash $h(x) = ax + b \bmod p$ with $a \ne 0$.

Definition 3 (Basic Feistel Permutation). Let $f$ be a mapping from $G$ to $G$.
We denote by $\bar{f}$ the permutation on $G \times G$ defined as
$\bar{f}(x) = (x^R, x^L + f(x^R))$ where $x = (x^L, x^R)$ and $x^L, x^R \in G$.

Definition 4 (Feistel Network). Let $f_1, \ldots, f_s$ be mappings from $G$ to $G$,
then we denote by $\Psi(f_1, \ldots, f_s)$ the permutation on $G \times G$ defined as

$$\Psi(f_1, \ldots, f_s) = \bar{f}_s \circ \cdots \circ \bar{f}_1 \tag{1}$$
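
Definitions 3 and 4 can be sketched in a few lines of code. The sketch below assumes the group $G$ is the integers modulo some `modulus` under addition; the function names are hypothetical and the round functions passed in are arbitrary placeholders, not anything prescribed by the text.

```python
def feistel_round(f, x, modulus):
    """Basic Feistel permutation: (xL, xR) -> (xR, xL + f(xR))."""
    x_left, x_right = x
    return x_right, (x_left + f(x_right)) % modulus


def feistel_round_inverse(f, y, modulus):
    """Inverse round: (yL, yR) -> (yR - f(yL), yL), using group subtraction."""
    y_left, y_right = y
    return (y_right - f(y_left)) % modulus, y_left


def feistel_network(fs, x, modulus):
    """Psi(f_1, ..., f_s): apply the rounds for f_1 first and f_s last."""
    for f in fs:
        x = feistel_round(f, x, modulus)
    return x


def feistel_network_inverse(fs, y, modulus):
    """Undo Psi(f_1, ..., f_s) by inverting the rounds in reverse order."""
    for f in reversed(fs):
        y = feistel_round_inverse(f, y, modulus)
    return y
```

Note that inversion uses subtraction rather than addition, so encryption and decryption differ whenever the group is not of characteristic 2. Each round is invertible regardless of whether `f` itself is.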

Theorem 1 (Luby-Rackoff). The permutation on $G \times G$ defined by

$$\Psi(f_1, f_2, f_3, f_4) \tag{2}$$

where the $f_i$ are keyed pseudorandom functions is a secure block cipher in the
sense that it cannot be distinguished from a random permutation by a polynomially
bounded adversary who mounts an adaptive chosen plaintext/ciphertext
attack.

3 Improving Luby-Rackoff Ciphers

In this section we provide a construction and proof of a more optimized version 
of a Luby-Rackoff cipher. Our construction is more practical than the one given 
by Naor and Reingold in [10] - which is currently the state of the art in Luby- 
Rackoff style block ciphers. Here is the main theorem proven in [10]: 

Theorem 2 (Naor-Reingold). Let $f_1, f_2$ be keyed pseudorandom functions.
Let $h_1, h_2$ be strongly universal hash functions which are permutations on $G \times G$.
Then, the permutation on $G \times G$ defined by

$$h_2 \circ \Psi(f_1, f_2) \circ h_1 \tag{3}$$

is a secure block cipher in the sense that it cannot be distinguished from a random
permutation by a polynomially bounded adversary who mounts an adaptive
chosen plaintext/ciphertext attack.

The Naor-Reingold construction was a major breakthrough in the design of
Luby-Rackoff ciphers since they were able to completely remove two calls of the
expensive pseudorandom functions, and in some sense replace them with much
more efficient non-cryptographic strongly universal hash functions.

Naor and Reingold mentioned two possible optimizations to their original
construction and proved them to be secure block ciphers. The first is to use the
same PRF in rounds two and three, thus saving key material: $h_2 \circ \Psi(f, f) \circ h_1$.

The other possible optimization is to use the construction $\Psi(h_1, f_1, f_2, h_2)$ where
the $h_i$ are $\epsilon$-$\Delta$-universal hash functions which now operate on only half the data,
as opposed to the entire 2n-bit data. This construction saves running time and
key material. Unfortunately, trying to realize both optimizations simultaneously
($\Psi(h_1, f, f, h_2)$) leads to an attack.¹ In particular, suppose that the
$\epsilon$-$\Delta$-universal hash function we use is the linear hash ($h_a(x) = ax$) over the field
$GF(2^n)$. First we encrypt (0, 0), which results in the left half of the ciphertext
being $V = f(f(h_1(0))) + h_1(0)$. Then we decrypt (0, 0), which results in the right
half of the plaintext being $R = f(f(h_2(0))) + h_2(0)$. When $h_a(x) = ax$, then
$h_1(0) = 0 = h_2(0)$. Thus $V = R$, clearly allowing us to distinguish the cipher from
a random permutation. Groups where addition and subtraction are different may
not be susceptible to this attack.
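
The attack just described can be reproduced on a toy instance. The sketch below assumes $GF(2^8)$ with the reduction polynomial $x^8 + x^4 + x^3 + x + 1$, a random lookup table as the shared round function $f$, and XOR as the group operation; the field size and all function names are choices made for this illustration only.

```python
def gf256_mul(a, b):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def make_cipher(a1, a2, f_table):
    """Psi(h1, f, f, h2) over GF(2^8) with linear hashes h_a(x) = a * x."""

    def encrypt(left, right):
        s = left ^ gf256_mul(a1, right)
        t = right ^ f_table[s]
        v = s ^ f_table[t]
        w = t ^ gf256_mul(a2, v)
        return v, w

    def decrypt(v, w):
        t = w ^ gf256_mul(a2, v)
        s = v ^ f_table[t]
        right = t ^ f_table[s]
        left = s ^ gf256_mul(a1, right)
        return left, right

    return encrypt, decrypt


def attack_succeeds(encrypt, decrypt):
    """Compare the left half of E(0, 0) with the right half of D(0, 0)."""
    v, _ = encrypt(0, 0)
    _, r = decrypt(0, 0)
    return v == r  # both equal f(f(0)) because h_a(0) = 0
```

The check succeeds for every key and every table `f_table`, because both halves collapse to $f(f(0))$; for a random permutation the two values would agree only by chance.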

This raises the question of whether or not one can use the same PRF’s in 
rounds two and three and have an efficient non-cryptographic function operating 
on only half the bits in rounds 1 and 4. In this paper, we give a construction 
which answers this question in the affirmative. 

We employ a novel construct called a Bi-symmetric $\epsilon$-$\Delta$-universal hash
function instead of the standard universal hash functions given in [10]. We also
give an example of such a Bi-symmetric $\epsilon$-$\Delta$-universal hash function that can be
implemented much more efficiently on many platforms than standard universal
hash function constructions. Using these Bi-symmetric $\epsilon$-$\Delta$-universal hash
functions in rounds one and four, and the same PRF in rounds 2 and 3, we can
get a secure and efficient Luby-Rackoff style cipher.

These hash functions, as defined below, possess two properties: the first one is
the usual $\epsilon$-$\Delta$-universal property. The second property is different, and enables
us to prevent the aforementioned attack.

Definition 5. Let $G$ be an Abelian group and let '−' and '+' denote the
subtraction and addition operations with respect to this group. Then $H$ is a family
of Bi-symmetric $\epsilon$-$\Delta$-universal hash functions if for all $x, y$ with $x \ne y$,
and all $\delta$, $\Pr_{h \in H}[h(x) - h(y) = \delta] \le \epsilon$, and for all $x', y'$ and for all $\delta$,
$\Pr_{h_1, h_2 \in H}[h_1(x') + h_2(y') = \delta] \le \epsilon$.

Typically we want $\epsilon$ to be extremely small - around $O(1/2^n)$ for inputs and
outputs of size n bits.

Theorem 3. Let $h_1, h_2$ be Bi-symmetric $\epsilon$-$\Delta$-universal hash functions, and
let $f$ be a random function. The block cipher defined by a four round Feistel
network $\Psi(h_1, f, f, h_2)$ cannot be distinguished from a random permutation with
probability better than $O(m^2 \cdot \epsilon)$ where $m$ is the number of queries made by the
adversary (who may have unlimited power), and all the operations are performed
in an Abelian group $G$.

¹ Daniel Bleichenbacher, personal communication

Maurer [9] presented a very simple proof of security of the original Luby- 
Rackoff construction. Naor-Reingold [10] gave a more formal structure for prov- 
ing adaptive security of Luby-Rackoff ciphers. Fortunately, the conditions that 
need to be satisfied for the security of the block cipher as presented by a Maurer 
type treatment are the same as the conditions resulting from the more formal 
treatment of Naor-Reingold. Hence the proof we present follows the simpler 
presentation of [9], but only proves security in the non-adaptive case. We can, 
however, easily prove adaptive security by following the treatment in [10]. 

Proof. Suppose we make $m$ plaintext/ciphertext queries to a black box which
can encrypt/decrypt. Moreover, we make these queries in a non-adaptive fashion;
i.e., we make them all at once. Based on the responses of the black box, we must
determine whether the black box contains a truly random permutation or a
permutation of the form $\Psi(h_1, f, f, h_2)$.

Now, suppose we have a 2n-bit plaintext message $M_i$ where we denote the
leftmost n bits as $L_i$, and the rightmost n bits as $R_i$. The following equations
describe the process by which this plaintext is encrypted by the Feistel network
we consider:

$$S_i = L_i + h_1(R_i) \tag{4}$$

$$T_i = R_i + f(S_i) \tag{5}$$

$$V_i = S_i + f(T_i) \tag{6}$$

$$W_i = T_i + h_2(V_i) \tag{7}$$

where $+$ is the addition operation in the group $G$. Here $(V_i, W_i)$ is the encrypted
output of the plaintext $(L_i, R_i)$. Similarly we can describe the decryption process
where the “inputs” are $(V_i, W_i)$ and the “outputs” are $(L_i, R_i)$. Suppose that
the following transcript describes the responses of the encryption/decryption black
box:

$$((L_1, R_1), (V_1, W_1)), \ldots, ((L_m, R_m), (V_m, W_m)) \tag{8}$$

We can assume without loss of generality that these queries are all distinct
because making the same query twice doesn’t help you. That is, for all
$1 \le i < j \le m$, $(L_i, R_i) \ne (L_j, R_j)$ and $(V_i, W_i) \ne (V_j, W_j)$. Now, let $\mathcal{E}_S$ denote the event
that the elements in the set $S = \{S_1, \ldots, S_m\}$ are all different, and let $\mathcal{E}_T$ denote
the event that the elements in the set $T = \{T_1, \ldots, T_m\}$ are all different (where $S_i$
and $T_i$ are defined as above). Finally, let $\mathcal{E}_{ST}$ denote the event that the sets $S$ and
$T$ are disjoint. Now, we consider what each event entails. First let’s examine the
case that the query made was an encryption query; i.e. the input was $(L_i, R_i)$ and
the output was $(V_i, W_i)$. If $\mathcal{E}_T$ occurs, then all the $V_i$’s will be random because
$V_i = S_i + f(T_i)$ and since the $T_i$’s are distinct and $f$ is a random function, it
follows that $S_i + f(T_i)$ is random. Similarly, if both $\mathcal{E}_S$ and $\mathcal{E}_{ST}$ occur, then $W_i$
will be random. This happens because $W_i = T_i + h_2(V_i) = R_i + f(S_i) + h_2(V_i)$
and $f(S_i)$ is random because $f$ is a random function, and the $S_i$’s are distinct
and different from all the $T_j$’s.

Now we look at the case for decryption queries. The inputs in this case are
$(V_i, W_i)$ and the outputs are $(L_i, R_i)$. Following the same lines of reasoning as
above, we want the “outputs” $(L_i, R_i)$ to be random. This happens if the inputs
to the random functions in rounds 2 and 3 are distinct. Therefore, the conditions
we need are specified by the events $\mathcal{E}_S$, $\mathcal{E}_T$ and $\mathcal{E}_{ST}$.

Now, if $\mathcal{E}_S$, $\mathcal{E}_T$, and $\mathcal{E}_{ST}$ all occur, then the adversary cannot distinguish our
Feistel permutation from a random permutation, except with negligible
probability [10]. All that remains is to derive a bound on the probability
that at least one of these events does not occur. Let $\overline{\mathcal{E}_S}$, $\overline{\mathcal{E}_T}$, $\overline{\mathcal{E}_{ST}}$ denote the
complements of these events. We proceed to derive our bounds:

$$
\begin{aligned}
\Pr[\overline{\mathcal{E}_S} \text{ or } \overline{\mathcal{E}_T}] &\le \sum_{1 \le i < j \le m} \Pr[T_i = T_j] + \sum_{1 \le i < j \le m} \Pr[S_i = S_j] && (9) \\
&= \sum_{1 \le i < j \le m} \Pr[W_i - h_2(V_i) = W_j - h_2(V_j)] && (10) \\
&\quad + \sum_{1 \le i < j \le m} \Pr[L_i + h_1(R_i) = L_j + h_1(R_j)] && (11) \\
&= \sum_{1 \le i < j \le m} \Pr[h_2(V_i) - h_2(V_j) = W_i - W_j] && (12) \\
&\quad + \sum_{1 \le i < j \le m} \Pr[h_1(R_i) - h_1(R_j) = L_j - L_i] && (13) \\
&\le m(m - 1)/2 \cdot (\epsilon + \epsilon) && (14)
\end{aligned}
$$

where the last inequality follows from the previous one by the following argument:
if $V_i = V_j$ (similarly $R_i = R_j$) then it follows that $W_i \ne W_j$ (similarly
$L_j \ne L_i$) by distinctness of queries; hence the event occurs with probability
0 in this case. If $V_i \ne V_j$ (similarly $R_i \ne R_j$), then the bound follows by the
Bi-symmetric $\epsilon$-$\Delta$-universality property of the $h_i$.

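Moreover, the above analysis holds regardless of whether the $S_i$ or $T_i$ were
generated by the process of an encryption query or a decryption query. All that
remains is to bound the probability that $\mathcal{E}_{ST}$ does not occur:

$$
\begin{aligned}
\Pr[\overline{\mathcal{E}_{ST}}] &\le \sum_{1 \le i, j \le m} \Pr[S_i = T_j] && (15) \\
&= \sum_{1 \le i, j \le m} \Pr[L_i + h_1(R_i) = W_j - h_2(V_j)] && (16) \\
&= \sum_{1 \le i, j \le m} \Pr[h_1(R_i) + h_2(V_j) = W_j - L_i] && (17) \\
&\le m^2 \epsilon && (18)
\end{aligned}
$$

where the last inequality follows from the previous one by the Bi-symmetric
$\epsilon$-$\Delta$-universality property of $H$. Finally, we can apply a union bound to get:

$$\Pr[\overline{\mathcal{E}_S} \text{ or } \overline{\mathcal{E}_T} \text{ or } \overline{\mathcal{E}_{ST}}] \le 2m^2 \cdot \epsilon \tag{19}$$

This concludes the proof. □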
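
To get a feel for the size of the bound $2m^2 \cdot \epsilon$, the following worked arithmetic plugs in illustrative numbers. The values $\epsilon = 2^{-159}$ and $m = 2^{64}$ are assumptions made only for this example and are not parameters taken from the text.

```latex
\begin{align*}
2m^2\epsilon &= 2 \cdot (2^{64})^2 \cdot 2^{-159} = 2^{1 + 128 - 159} = 2^{-30}, \\
2m^2\epsilon &= 1 \iff m^2 = 2^{158} \iff m = 2^{79}.
\end{align*}
```

Under these assumed values the bound remains small for $2^{64}$ queries and only becomes vacuous as the number of queries approaches $2^{79}$.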

3.1 An Example Bi-symmetric $\epsilon$-$\Delta$-universal hash function

Definition 6. Let $p$ be a prime. Define the square hash family of functions from
$Z_p$ to $Z_p$ as:

$$h_x(m) = (m + x)^2 \bmod p \tag{20}$$

Theorem 4. The square hash family of functions is $\Delta$-universal.

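Proof: For all $m \ne n \in Z_p$ and $\delta \in Z_p$: $\Pr_x[h_x(m) - h_x(n) = \delta] =
\Pr_x[(m + x)^2 - (n + x)^2 = \delta] = \Pr_x[m^2 - n^2 + 2(m - n)x = \delta] = 1/p$, where the last
equality follows since for any given $m \ne n \in Z_p$ and $\delta \in Z_p$ there is a unique
$x$ which satisfies the equation $m^2 - n^2 + 2(m - n)x = \delta$ (for odd $p$, the
coefficient $2(m - n)$ is invertible modulo $p$). □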
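
The uniqueness argument in this proof can be checked exhaustively for a small prime. The sketch below is only a brute-force sanity check under the assumption that $p$ is a small odd prime; the function names are hypothetical.

```python
def square_hash(x, m, p):
    """Square hash h_x(m) = (m + x)^2 mod p."""
    return (m + x) ** 2 % p


def max_difference_count(p):
    """Largest number of keys x with h_x(m) - h_x(n) = delta, over m != n and delta."""
    worst = 0
    for m in range(p):
        for n in range(p):
            if m == n:
                continue
            counts = [0] * p
            for x in range(p):
                delta = (square_hash(x, m, p) - square_hash(x, n, p)) % p
                counts[delta] += 1
            worst = max(worst, max(counts))
    return worst
```

For an odd prime such as 11 or 13 the function returns 1, meaning exactly one key out of $p$ produces any given difference, which matches the probability $1/p$ in the proof.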

Theorem 5. The square hash is a Bi-symmetric $\epsilon$-$\Delta$-universal family of hash
functions.

Proof: We have already proved the first property and need to prove the
property that for all $m, n$ and for all $\delta$, $\Pr_{x,y}[h_x(m) + h_y(n) = \delta] \le \epsilon$. We have
$\Pr_{x,y}[(m + x)^2 + (n + y)^2 = \delta \bmod p] = \Pr_{x,y}[x^2 + 2xm + m^2 + y^2 + n^2 + 2ny - \delta = 0 \bmod p]$.
There are at most 2 solutions for $x$ for a given key $y$, so
altogether there are at most $2^{n+1}$ solutions out of $2^{2n}$ possible keys $x, y$ (here
$n$ in the exponents denotes the bit length of $p$). So
$\Pr[h_x(m) + h_y(n) = \delta] \le 2^{n+1}/2^{2n} = 2^{-(n-1)}$. If we count the number of solutions by
further reducing this modulo $2^n$, then $\epsilon$ increases by a small factor. □
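
The counting step in this proof (at most two values of $x$ for each $y$, hence at most $2p$ key pairs) can also be checked by brute force for small primes. This is only a sanity check under the assumption of a small prime $p$; `max_sum_count` is a hypothetical helper name.

```python
def max_sum_count(p):
    """Largest number of key pairs (x, y) with h_x(m) + h_y(n) = delta (mod p)."""
    worst = 0
    for m in range(p):
        for n in range(p):
            counts = [0] * p
            for x in range(p):
                hx = (m + x) ** 2 % p
                for y in range(p):
                    counts[(hx + (n + y) ** 2) % p] += 1
            worst = max(worst, max(counts))
    return worst
```

Dividing the returned count by $p^2$, the number of key pairs, gives the empirical value of $\epsilon$ for the second property, which the proof bounds by $2p / p^2 = 2/p$.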

We remark that the function $h_x(m) = ((m + x)^2 \bmod p) \bmod 2^n$ is also a
Bi-symmetric $\epsilon$-$\Delta$-universal family of hash functions for a slightly larger value of
$\epsilon$. This function is more useful for implementation purposes since we often work
over addition modulo $2^n$ (e.g. bit strings of length n). The Square Hash family
requires much less computation time on many platforms than the traditional
linear hash. The speedup occurs because squaring an n-bit number requires
roughly half the number of basic word multiplications needed to multiply
two n-bit numbers. Thus square hash requires fewer operations and instructions
to implement. More details can be found in [14].
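
Putting the pieces together, a four-round cipher of the form $\Psi(h_1, f, f, h_2)$ with square hashes in the outer rounds might be sketched as follows. This is a toy illustration, not the authors' implementation: it works in the group $Z_P$ for the Mersenne prime $P = 2^{61} - 1$ (rather than modulo $2^n$), and it uses SHA-256 of a key and the input, reduced modulo $P$, as a hypothetical stand-in for the shared round function $f$.

```python
import hashlib

P = 2**61 - 1  # a Mersenne prime, chosen only for illustration


def square_hash(key, m):
    """Square hash h_key(m) = (m + key)^2 mod P."""
    return (m + key) ** 2 % P


def round_function(key, x):
    """Hypothetical stand-in for f: SHA-256 of key || x, reduced mod P."""
    data = key + x.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % P


def encrypt(keys, left, right):
    k1, kf, k2 = keys
    s = (left + square_hash(k1, right)) % P   # S = L + h1(R)
    t = (right + round_function(kf, s)) % P   # T = R + f(S)
    v = (s + round_function(kf, t)) % P       # V = S + f(T)
    w = (t + square_hash(k2, v)) % P          # W = T + h2(V)
    return v, w


def decrypt(keys, v, w):
    k1, kf, k2 = keys
    t = (w - square_hash(k2, v)) % P
    s = (v - round_function(kf, t)) % P
    right = (t - round_function(kf, s)) % P
    left = (s - square_hash(k1, right)) % P
    return left, right
```

Decryption undoes the four rounds in reverse order using subtraction in $Z_P$, so the two directions are genuinely different operations, unlike an XOR-based Feistel network.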

4 Proving Security Under MAC Assumption 

We give an alternate proof of security of our construction. This proof utilizes
a weaker, but perhaps more practical, assumption, and makes a weaker claim
on security of our block cipher. In particular, we show that if the underlying
function $f$ in our Feistel network works as a secure Message Authentication
Code (MAC), then it is infeasible for an adversary to come up with a random
plaintext/ciphertext pair after mounting an adaptive chosen plaintext/ciphertext
attack. Earlier work on the relationship between unpredictability (MACs)
and indistinguishability appears in [11]. We now define the relevant notions
and then prove our claim.

The goal of message authentication codes is for one party to efficiently transmit
a message to another party in a way that enables the receiving party to
determine whether or not the message he receives has been tampered with. The
setting involves two parties, Alice and Bob, who have agreed on a pre-specified
secret key $x$. There are two algorithms used: a signing algorithm $S_x$ and a
verification algorithm $V_x$. If Alice wants to send a message $M$ to Bob then she first
computes a message authentication code, or MAC, $\mu = S_x(M)$. She sends $(M, \mu)$
to Bob, and upon receiving the pair, Bob computes $V_x(M, \mu)$ which returns 1 if
the MAC is valid, or returns 0 otherwise. Without knowledge of the secret key
$x$, it should be next to impossible for an adversary to construct a message and
corresponding MAC that the verification algorithm will accept as valid.
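
The signing and verification algorithms described above can be sketched with a standard library MAC. The choice of HMAC-SHA-1 and the function names `sign` and `verify` are assumptions made for illustration; the text does not fix a particular MAC at this point.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """S_x(M): compute the tag mu for message M under the shared key x."""
    return hmac.new(key, message, hashlib.sha1).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """V_x(M, mu): accept only if mu is the correct tag for M."""
    return hmac.compare_digest(sign(key, message), tag)
```

Alice would send the pair `(message, sign(key, message))`, and Bob would accept it only when `verify` returns `True`. The constant-time comparison avoids leaking how many tag bytes matched.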

The formal security requirement for a Message Authentication Code was
defined by Bellare, Canetti and Krawczyk [3]. In particular, we say that an
adversary forges a MAC if, when given oracle access to $(S_x, V_x)$, where $x$ is
kept secret, the adversary can come up with a valid pair $(M^*, \mu^*)$ such that
$V_x(M^*, \mu^*) = 1$ but the message $M^*$ was never made an input to the oracle for
$S_x$. Here is a formal definition of an alternate, but equivalent, formulation:

Definition 7. We say that a function $f_k$ is an $(\epsilon, q)$-secure Message Authentication
Code (MAC), if the probability that an adversary can successfully perform
the following experiment is bounded by $\epsilon$:

1. (Adversary is given black box access to $f_k$) The adversary makes $q$ possibly
adaptively chosen queries to a black box for $f_k$ - that is, the adversary comes
up with a message $m_1$, gets to see $f_k(m_1)$, and from this information comes
up with a query $m_2$, gets to see $f_k(m_2)$, and so on until making a query $m_q$,
and getting to see $f_k(m_q)$.

2. The adversary comes up with a pair $(m, f_k(m))$ where $m$ is different from
any message queried in the first part. That is, $m \ne m_i$ for $1 \le i \le q$.
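
The two-step experiment in Definition 7 can be modelled as a small test harness. The class below is a sketch under stated assumptions: `MacOracle` and its methods are hypothetical names, and HMAC-SHA-1 is used as a stand-in for $f_k$.

```python
import hashlib
import hmac


class MacOracle:
    """Black box for f_k that records every message queried in step 1."""

    def __init__(self, key: bytes):
        self._key = key
        self.queried = set()

    def _tag(self, message: bytes) -> bytes:
        return hmac.new(self._key, message, hashlib.sha1).digest()

    def query(self, message: bytes) -> bytes:
        """Step 1: the adversary submits a message and sees f_k(message)."""
        self.queried.add(message)
        return self._tag(message)

    def is_forgery(self, message: bytes, tag: bytes) -> bool:
        """Step 2: a valid tag on a message never submitted in step 1."""
        fresh = message not in self.queried
        return fresh and hmac.compare_digest(self._tag(message), tag)
```

Replaying a pair obtained from `query` does not count as a forgery, since the message is no longer fresh; only a correct tag on an unqueried message wins the experiment.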

Message Authentication has been a well studied problem, and there are a 
number of schemes which are widely believed to be secure. We show that we can 
use any of these schemes as the underlying function / in our block cipher and 
still get a fairly secure encryption scheme. 

We now explain what it means for a cryptosystem to be randomly-secure
against adaptive chosen message/ciphertext attacks. The definition has a similar
flavor to the above definition for MACs.

Definition 8. We say that a secret key encryption algorithm $E_k$ is
$(\epsilon, q)$-randomly-secure against adaptive chosen plaintext/ciphertext attacks if no
adversary can successfully perform the following experiment with probability better
than $\epsilon$:

1. (The adversary is given black box access to $E_k$) The adversary makes $q$
possibly adaptively chosen plaintext/ciphertext queries to a black box for $E_k$;
for example, the adversary comes up with a message $m_1$, gets to see $E_k(m_1)$,
and from this information comes up with a query $m_2$, gets to see $E_k(m_2)$,
and so on until making a query $m_q$ and getting to see $E_k(m_q)$.

2. For a given randomly chosen message (or ciphertext) the adversary comes up
with the corresponding ciphertext (or plaintext). This message (ciphertext)
is different from any message (ciphertext) queried in the first part.

We remark that there are alternate, and more stringent, definitions of security 
for symmetric key encryption schemes. 

Theorem 6. Let $f$ be an $(\epsilon_1, 2q)$-secure MAC, and let $h_1, h_2$ be $\epsilon_2$-$\Delta$-universal
hash functions. Then, $\Psi(h_1, f, f, h_2)$ is $(\epsilon_1 + q^2\epsilon_2^2, q)$-randomly-secure under
adaptive chosen plaintext/ciphertext attacks.

Proof. The overall idea is to show that given any adversary $A'$ who can break
the encryption scheme (given black box access to the encryption and decryption
mechanisms), we can construct an adversary $A$ who can break the underlying
MAC $f$ (given black box access to the MAC function $f$). $A$ will work as follows.
First, $A$ picks two hash functions $h_1$ and $h_2$ at random from a family of
$\epsilon_2$-$\Delta$-universal hash functions. Then $A$ proceeds simulating $A'$. At some point $A'$
is going to make a query which could be in either of two forms: “Please give
me the encryption of a message m” or “Please give me the decryption of a
ciphertext c.” In either case, $A$ must answer this query for $A'$ in a legitimate
fashion. $A$ can do this easily by making two calls to the black box for $f$ and
simulating the encryption or decryption algorithms. For example, if the $i$th query
is an encryption query on a message $M_i = (L_i, R_i)$, then $A$ computes values
$S_i, T_i, V_i, W_i$ according to equations 4, 5, 6, and 7.

We see that $A$ calls the black box for $f$ whenever it computes $T_i$ and $V_i$.
Decryption queries are handled in a similar fashion. Now, after $A'$ finishes making
$q$ queries (which results in $A$ having made $2q$ queries), it will output a ciphertext
$(V, W)$ and the corresponding plaintext $(L, R)$. If the plaintext is different from
the plaintexts given during all the encryption queries made by $A'$, and the ciphertext
differs from the ones given during the decryption queries made by $A'$, then $A'$
has successfully broken the encryption scheme. Now, we translate this break of
the encryption into something that breaks the underlying MAC $f$ with high
probability. So, we now may have derived two potential (Message, Tag) pairs. $A$
computes:

$$S = L + h_1(R) \tag{21}$$

$$T = W - h_2(V) \tag{22}$$

$$\mathrm{Tag}(S) = T - R \tag{23}$$

$$\mathrm{Tag}(T) = V - S \tag{24}$$
If you work out the equations, it’s easy to see that $\mathrm{Tag}(S) = f(S)$ and
$\mathrm{Tag}(T) = f(T)$. Therefore, $(S, \mathrm{Tag}(S))$ and $(T, \mathrm{Tag}(T))$ are valid Message/Tag
pairs. It appears as if we are done, but there is still one technicality remaining.
Recall that when we defined the notion of a secure MAC, we required that the
message output by the adversary must differ from the messages that the adversary
gave to the black box during the query phase. We must now bound the
probability that $S = S_i$ for some $i$ ($1 \le i \le q$).

$$
\begin{aligned}
&\Pr[\exists i,\ 1 \le i \le q \text{ such that } S = S_i] && (25) \\
&\quad = \Pr[\exists i \text{ such that } L + h_1(R) = L_i + h_1(R_i)] && (26) \\
&\quad \le \sum_{i=1}^{q} \Pr[L + h_1(R) = L_i + h_1(R_i)] && (27) \\
&\quad \le q\epsilon_2 && (28)
\end{aligned}
$$

The last inequality follows from the previous because $h_1$ is an $\epsilon_2$-$\Delta$-universal
hash function. Similarly, one can show that

$$\Pr[\exists i,\ 1 \le i \le q \text{ such that } T = T_i] \le q\epsilon_2 \tag{29}$$

Since the hash functions $h_1$ and $h_2$ were chosen independently at random, it
follows that the likelihood that both $S = S_i$ and $T = T_i$ is at most $(q\epsilon_2)^2$.

What remains is to find an upper bound for the success probability of $A'$. We
can do this by first deriving a lower bound for the success probability of $A$. $A$ is
successful whenever both of two conditions happen:

Condition 1 $A'$ outputs a message and a valid ciphertext for that message
(i.e. $A'$ is successful).

Condition 2 At least one of the $S$, $T$ that are generated does not coincide with
something that was queried before; i.e. you can use the message and ciphertext
to generate a valid message/tag pair where the message was not part of
a previous query.

An easier way to proceed is to derive an upper bound for the probability
that $A$ fails, and subtract that number from 1. Suppose that the probability
of condition 1 being met is at least $\epsilon_s$ - then $\Pr[\text{Condition 1 doesn't happen}]
\le (1 - \epsilon_s)$. Now, we know that condition 2 fails to occur with probability at
most $(q\epsilon_2)^2$. Therefore, by a union bound, $\Pr[A \text{ is unsuccessful}] \le 1 - \epsilon_s + (q\epsilon_2)^2$.
Therefore, $\Pr[A \text{ is successful}] \ge \epsilon_s - (q\epsilon_2)^2$. Now, by assumption we have that
$\Pr[A \text{ is successful}] \le \epsilon_1$. Therefore, it follows that $\epsilon_s \le \epsilon_1 + q^2\epsilon_2^2$, which is a
bound on the success probability of $A'$. We have thus shown that we break the cipher
with probability at most $\epsilon_1 + q^2\epsilon_2^2$ after making $q$ queries. And this concludes
the proof. □
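
The tag-extraction step used in this proof (equations 21 to 24) can be checked on a toy instance. The sketch assumes the group $Z_P$ with $P = 2^{61} - 1$, square hashes for $h_1, h_2$, and a SHA-256-based function with a fixed key as a hypothetical stand-in for the MAC $f$; none of these choices come from the text.

```python
import hashlib

P = 2**61 - 1  # illustrative prime modulus for the group Z_P


def h(key, m):
    """Square hash (m + key)^2 mod P, used for both h1 and h2."""
    return (m + key) ** 2 % P


def f(x):
    """Hypothetical keyed function playing the role of the MAC."""
    digest = hashlib.sha256(b"mac-key" + x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % P


def encrypt(k1, k2, left, right):
    s = (left + h(k1, right)) % P
    t = (right + f(s)) % P
    v = (s + f(t)) % P
    w = (t + h(k2, v)) % P
    return v, w


def extract_tags(k1, k2, plaintext, ciphertext):
    """Recover (S, Tag(S)) and (T, Tag(T)) from one plaintext/ciphertext pair."""
    (left, right), (v, w) = plaintext, ciphertext
    s = (left + h(k1, right)) % P   # S = L + h1(R)
    t = (w - h(k2, v)) % P          # T = W - h2(V)
    return (s, (t - right) % P), (t, (v - s) % P)
```

Anyone who knows $h_1$ and $h_2$ and holds a matching plaintext/ciphertext pair thus obtains two valid message/tag pairs for $f$ without ever calling $f$ on those inputs, which is what the reduction exploits.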

5 An Example Block Cipher: Sha-zam 

In this section we discuss the design of Sha-zam, a block cipher based on the
constructions and theorems proved earlier. We use SHA-1 as the underlying
primitive instead of a family of pseudorandom functions. Replacing pseudorandom
functions by cryptographic functions (with desired properties) is not new.
Biham and Anderson [2] propose the use of SHA-1 in conjunction with stream
ciphers to design block ciphers. Also, Lucks used MD5 with an unbalanced Feistel
network, and Guttman’s construction uses SHA-1 but differs from the
Luby-Rackoff construction. In our design we do not use any stream ciphers. Instead
we rely entirely on the improved versions of the Luby-Rackoff construction and
use SHA-1 as our underlying primitive. Recall that in section 4 we showed that
Luby-Rackoff ciphers are secure if the underlying primitive is a secure MAC. Here
security is with respect to some form of invertibility. From a practical standpoint
the use of SHA-1 is justified. For example, the internet RFC HMAC-SHA-1 [4]
assumes that forging a tag using SHA-1 as the underlying MAC is hard.

In addition to SHA-1 we use the Square Hash function (SQH) which we
introduced earlier. This cipher is driven by a key scheduling generator whose
security is also related to SHA-1. Hence we get efficient re-use of code. Based on
our results in earlier sections, under the assumption that SHA-1 is pseudorandom we
have that Sha-zam is a block cipher secure against adaptive chosen plaintext
and ciphertext attacks. Under the weaker assumption that SHA-1 is a secure
MAC we show that no adversary can invert Sha-zam.

If C is an n-bit string, we denote by $\mathrm{prefix}_k(C)$ the $(n - k)$-bit prefix of C
(i.e. the first $n - k$ bits of C). We denote by SHA(IV, x) the 160 bit output
produced by SHA-1 on a 512 bit user specified input x and the standard IV.
Our block cipher, which we call Sha-zam, takes as input a 320 bit block M and
outputs a 320 bit ciphertext C. We denote M = (L, R) where L is the left 160
bits of M and R is the right 160 bits of M. Also, we prefer to keep IV secret. In
our construction we use three keys $k_1, k_2, k_3$.

Encryption with Sha-zam

Input: Plaintext stored in L, R, each of which is 160 bits

Private Key: $k = (k_1, k_2, k_3)$ where
$k_1, k_3$ are 160 bits each, and $k_2$ is 352 bits.

If IV not secret: then use the standard 160 bit IV.

Output: Ciphertext stored in V, W, each of which is 160 bits

Procedure: $S = L + \mathrm{SQH}_{k_1}(R) \bmod 2^{160}$

$T = R + \mathrm{SHA}(IV, S, k_2) \bmod 2^{160}$

$V = S + \mathrm{SHA}(IV, T, k_2) \bmod 2^{160}$

$W = T + \mathrm{SQH}_{k_3}(V) \bmod 2^{160}$



Decryption with Sha-zam

Input: Ciphertext stored in V, W, each of which is 160 bits

Private Key: $k = (k_1, k_2, k_3)$ where
$k_1, k_3$ are 160 bits each, and $k_2$ is 352 bits.

If IV not secret: then use the standard 160 bit IV.

Output: Plaintext stored in L, R

Procedure: $T = W - \mathrm{SQH}_{k_3}(V) \bmod 2^{160}$

$S = V - \mathrm{SHA}(IV, T, k_2) \bmod 2^{160}$

$R = T - \mathrm{SHA}(IV, S, k_2) \bmod 2^{160}$

$L = S - \mathrm{SQH}_{k_1}(R) \bmod 2^{160}$



Our block cipher can be implemented efficiently. Specifically, it can encrypt 
messages in roughly the same time as it would take DES to accomplish this same 
task. 
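The four-round structure above can be sketched in a few lines. This is only an illustration under stated assumptions: `sqh` is a placeholder squaring hash (the actual SQH is defined earlier in the paper and may differ), and `sha_round` uses the library SHA-1 on the 512-bit string formed by the 160-bit round input and the 352-bit key, as a stand-in for SHA(IV, ., k_2) with the standard IV. All function names are hypothetical.

```python
import hashlib

MOD = 1 << 160  # all additions and subtractions are modulo 2^160


def sqh(key: int, x: int) -> int:
    # Placeholder square hash (assumed form): (x + key)^2 mod 2^160.
    return pow((x + key) % MOD, 2, MOD)


def sha_round(x: int, k2: int) -> int:
    # Stand-in for SHA(IV, x, k2): hash the 160-bit value x followed by the
    # 352-bit key k2 (512 bits in total) with the library SHA-1.
    data = x.to_bytes(20, "big") + k2.to_bytes(44, "big")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def encrypt(L, R, k1, k2, k3):
    S = (L + sqh(k1, R)) % MOD
    T = (R + sha_round(S, k2)) % MOD
    V = (S + sha_round(T, k2)) % MOD
    W = (T + sqh(k3, V)) % MOD
    return V, W


def decrypt(V, W, k1, k2, k3):
    # Undo the four rounds in reverse order.
    T = (W - sqh(k3, V)) % MOD
    S = (V - sha_round(T, k2)) % MOD
    R = (T - sha_round(S, k2)) % MOD
    L = (S - sqh(k1, R)) % MOD
    return L, R
```

Decryption inverts encryption whatever the round functions are, because each round only adds a value that the other side can recompute.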
We describe a practical and secure pseudo-random generator, which we use
for key scheduling. The security is based on SHA-1. If we run our generator 
using a randomly selected key as an input seed, we can securely generate the 
necessary bits needed for the secret key of our block cipher. We make use of a 
512 bit prespecified global constant C . We almost never use the entire constant 
C but often take some specified prefix of it depending on the length of the key 
we’re working with. We now describe our generator. Given a seed s we generate 
pseudorandom bits as follows: 

Description of Generator
Input: seed s

1. Let $s_0 = s$

2. For i = 1 to m do

Let $s_i = \mathrm{SHA}(IV, \mathrm{prefix}(C), s_{i-1})$

3. Output: $(h(s_1), \ldots, h(s_m))$



In step 3 above, h refers to a hash function chosen from a universal class.
For example, the linear congruential hash function in any finite field is a very
good candidate. The proof of security of this key scheduling generator will be
presented elsewhere [15].
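The generator can be sketched as follows. Several choices here are assumptions made only for illustration: the library SHA-1 replaces SHA(IV, ., .), the prefix of C is taken to be its first 352 bits so that prefix and 160-bit state together fill 512 bits, and h is a linear congruential hash x -> (a x + b) mod p over the prime field with p = 2^521 - 1. The names `keystream`, `a`, `b` and `c` are hypothetical.

```python
import hashlib

P = (1 << 521) - 1  # a Mersenne prime, used here as an illustrative field size


def keystream(seed: bytes, m: int, a: int, b: int, c: bytes) -> list:
    """Return [h(s_1), ..., h(s_m)] with s_i = SHA1(prefix(C) || s_{i-1})."""
    prefix = c[:44]          # 352-bit prefix of the 512-bit constant C (assumed)
    state = seed             # s_0 = s
    out = []
    for _ in range(m):
        state = hashlib.sha1(prefix + state).digest()          # s_i
        out.append((a * int.from_bytes(state, "big") + b) % P)  # h(s_i)
    return out
```

For example, `keystream(bytes(20), 3, a=5, b=7, c=bytes(range(64)))` returns three field elements derived from an all-zero seed.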

6 A Discussion on Optimality 

Since the invention of Luby-Rackoff ciphers, considerable progress has been made 
with respect to making the construction more efficient. Specifically, as noted in 
the introduction, most of the focus has been in “reducing” the number of invo- 
cations of a random function and the amount of key material used. Following 
the work of Lucks [8] , Naor-Reingold have produced extremely efficient construc- 
tions with the help of hash functions and just two calls to a random function. In 
the present work, we have described a further generalization by using a different 
class of hash functions operating on only half the size of the input, in addition to 
a reduction in the key material. Is the end of progress in sight? We now discuss 
what it might mean for a Luby-Rackoff cipher to be optimal. We present various 
parameters of interest, and discuss how our proposal fits within this discussion. 

1. Minimal number of rounds: Luby-Rackoff in their original paper showed that 
two rounds are not enough and 3 rounds are needed for plaintext security. 
Furthermore in order to resist adaptive attacks four rounds are needed. Our 
construction also consists of 4 rounds. 

2. Maximal Security: Patarin [12] showed that the 4 round Luby-Rackoff
construction can be distinguished from a random permutation with probability
$O(m^2/2^n)$ with m queries. We meet this bound as stated in Theorem 1. We can
reduce the distinguishing probability by increasing the number of rounds,
but this would violate the previous criterion.
3. Minimal Rounds of PRFs: Since the output of the block cipher is 2n bits long,
it would seem that two n bit PRFs are necessary to ensure cryptographic
security. We also use only two PRFs in rounds two and three. For rounds
one and four we use non-cryptographic functions, called Bi-symmetric
$\epsilon$-$\Delta$-universal functions, which add to the efficiency significantly.

4. Reusing the same PRF: It has been the goal of many papers to reduce the
number of different PRFs that are used, ultimately hoping to use just one
PRF. We achieve this goal by just using one PRF in rounds two and three.

5. Minimal Data Size Operated on by Non-cryptographic function: Our
Bi-symmetric $\epsilon$-$\Delta$-universal hash functions in rounds one and four operate
on n bits of the data. If we operated on any smaller part of the data then
we would open ourselves to collisions that can be detected with a lower
number of queries, thus increasing the distinguishing probability and decreasing
security.

6. Reusing Hash Functions: It might be tempting to use the same universal hash
function in rounds 1 and 4 to save key material. However, using the same
hash h in both rounds, in groups where $g = -g$ for all $g \in G$, unfortunately
leads to an attack which we now describe.

Consider the group of n bit vectors with respect to the usual XOR
operation. When we encrypt (0,0), the left half of the resulting ciphertext
is $V = f(f(h(0))) + h(0)$. Then we decrypt (0,0) also, which yields a plaintext
whose right half is $R = f(f(h(0))) + h(0)$. Since V and R are equal, we
immediately distinguish the block cipher from a random permutation.

In our construction we may be able to use the same h. This seems plausible
due to the fact that our network operates with Abelian groups whose operations
are not symmetric (i.e. + and - are different), where we may be able to exploit
the use of specialized hash functions with the property
$\Pr_{h \in H}[h(x) + h(y) = \delta] \le \epsilon$ for all x, y, and $\delta$. When working
in the cyclic group of integers modulo $2^n$, we note that the Square hash function
satisfies this specialized property.
In particular, even if the above property does not suffice, other properties of
square hash might aid in proving our claim. We leave this for further study.
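The XOR-group attack in item 6 is easy to reproduce. The sketch below is an illustration only: it models both the round function f and the shared hash h as lazily sampled random tables (an assumption made for the demonstration), builds the four-round network over n-bit XOR, and compares the left half of the encryption of (0,0) with the right half of the decryption of (0,0). All names are hypothetical.

```python
import random

N_BITS = 16


def make_random_function(rng):
    """Lazily sampled random function on N_BITS-bit values."""
    table = {}

    def func(x):
        if x not in table:
            table[x] = rng.getrandbits(N_BITS)
        return table[x]

    return func


def encrypt(L, R, h, f):
    S = L ^ h(R)
    T = R ^ f(S)
    V = S ^ f(T)
    W = T ^ h(V)          # same h as in round one
    return V, W


def decrypt(V, W, h, f):
    T = W ^ h(V)
    S = V ^ f(T)
    R = T ^ f(S)
    L = S ^ h(R)
    return L, R


def looks_like_reused_hash(enc, dec):
    """Two queries: encrypt (0,0) and decrypt (0,0), then compare halves."""
    V, _ = enc(0, 0)
    _, R = dec(0, 0)
    return V == R
```

For a truly random permutation on 2n bits the two halves would agree only with probability about 2^-n, so a match is strong evidence of the reused hash.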



7 Conclusion 



In this paper we have described some novel improvements to Luby-Rackoff
ciphers. We introduce the concept of a Bi-symmetric $\epsilon$-$\Delta$-universal hash function
and provide an example of such a class. This concept, when applied to the
Naor-Reingold construction of a Luby-Rackoff cipher, improves the efficiency. In
addition we show that under the weaker and more practical assumption of a secure
MAC, Luby-Rackoff ciphers are hard to invert. We discuss the
design of a new cipher based on these improvements, Sha-zam, whose security
is related to SHA-1.
Abstract. We study the functions from $\mathbb{F}_2^m$ into $\mathbb{F}_2^m$ for odd m which
oppose an optimal resistance to linear cryptanalysis. These functions are
called almost bent. It is known that almost bent functions are also almost
perfect nonlinear, i.e. they also ensure an optimal resistance to differential
cryptanalysis but the converse is not true. We here give a necessary
and sufficient condition for an almost perfect nonlinear function to be
almost bent. This notably enables us to exhibit some infinite families of
power functions which are not almost bent.

1 Introduction 

This paper is devoted to the study of the functions $f$ from $\mathbb{F}_2^m$ into $\mathbb{F}_2^m$ which
achieve the highest possible nonlinearity. This means that any non-zero linear
combination of the Boolean components of $f$ is as far as possible from the set
of Boolean affine functions with m variables. When m is odd, the highest possible
value for the nonlinearity of a function over $\mathbb{F}_{2^m}$ is known and the functions
achieving this bound are called almost bent. These functions play a major role in
cryptography; in particular their use in the S-boxes of a Feistel cipher ensures the
best resistance to linear cryptanalysis. It was recently proved that the
nonlinearity of a function from $\mathbb{F}_2^m$ into $\mathbb{F}_2^m$ corresponds to the minimum distance of
the dual of a linear code $C_f$ of length $(2^m - 1)$. In particular when $f$ is a power
function, $f : x \mapsto x^s$, this code $C_f$ is the cyclic code $C_{1,s}$ of length $(2^m - 1)$
whose zeros are $\alpha$ and $\alpha^s$ ($\alpha$ denotes a primitive element of $\mathbb{F}_{2^m}$). It was also
established that if a function over $\mathbb{F}_2^m$ for odd m ensures the best resistance
to linear cryptanalysis, it also ensures the best resistance to differential
cryptanalysis. For the associated code $C_f$, this means that if its dual (or orthogonal)
code, denoted by $C_f^\perp$, has the highest possible minimum distance, then $C_f$ has
minimum distance at least 5. But the converse does not hold. Using Pless power
moment identities and some ideas due to Kasami, we make this
condition necessary and sufficient by adding a requirement on the divisibility of
the weights of $C_f^\perp$. Since the divisibility of the weights of the cyclic code $C_{1,s}^\perp$
is completely determined by McEliece's theorem, the determination of the
values of s such that the power function $x \mapsto x^s$ is almost bent on $\mathbb{F}_{2^m}$ is now
reduced to a combinatorial problem. This notably yields a very fast algorithm
for checking if a power function over $\mathbb{F}_{2^m}$ is almost bent, even for large values
of m. McEliece's theorem can also be used for proving that $C_{1,s}^\perp$ contains a
codeword whose weight does not have the appropriate divisibility. We are then able
to prove that, for some infinite families of values of s, the power function $x \mapsto x^s$
is not almost bent on $\mathbb{F}_{2^m}$.

The next section recalls the link between the weight distribution of the duals
of cyclic codes with two zeros and the nonlinearity of a function from $\mathbb{F}_{2^m}$ into
$\mathbb{F}_{2^m}$. In Section 3 we develop a new theoretical tool for studying the weight
distribution of some linear codes, which generalizes some ideas due to Kasami.
Combined with McEliece's theorem, this method provides a new characterization
of almost bent power mappings. Section 4 then focuses on power functions
$x \mapsto x^s$ over $\mathbb{F}_{2^m}$ for odd m when the exponent s can be written as
$s = 2^{(m-1)/2} + 2^i - 1$.
This set of exponents contains the values which appear in both Welch's and
Niho's almost bent functions. We here prove that for most values of i,
$x \mapsto x^{2^{(m-1)/2} + 2^i - 1}$ is not almost bent on $\mathbb{F}_{2^m}$. In Section 5 we finally
give a very simple
necessary condition on the exponents s providing almost bent power functions
on $\mathbb{F}_{2^m}$ when m is not a prime; in this case we are able to eliminate most values
of s. We also prove that the conjectured almost perfect nonlinear function $x \mapsto x^s$
with $s = 2^{4g} + 2^{3g} + 2^{2g} + 2^g - 1$ over $\mathbb{F}_{2^{5g}}$ is not almost bent.

2 Almost Bent Functions and Cyclic Codes with Two Zeros 

2.1 Almost Perfect Nonlinear and Almost Bent Functions 

Let $f$ be a function from $\mathbb{F}_2^m$ into $\mathbb{F}_2^m$. For any $(a, b) \in \mathbb{F}_2^m \times \mathbb{F}_2^m$, we define

$$\delta_f(a, b) = \#\{x \in \mathbb{F}_2^m,\ f(x + a) + f(x) = b\}$$

$$\lambda_f(a, b) = \left|\#\{x \in \mathbb{F}_2^m,\ a \cdot x + b \cdot f(x) = 0\} - 2^{m-1}\right|$$

where $\cdot$ is the usual dot product on $\mathbb{F}_2^m$. These values are of great importance
in cryptography especially for measuring the security of an iterated block cipher
using $f$ as a round permutation. A differential attack [2] against such a
cipher exploits the existence of a pair (a, b) with $a \ne 0$ such that $\delta_f(a, b)$ is high.
Similarly a linear attack is successful if there is a pair (a, b) with $b \ne 0$ such
that $\lambda_f(a, b)$ is high. The function $f$ can then be used as a round function of an
iterated cipher only if both

$$\delta_f = \max_{a \ne 0} \max_{b} \delta_f(a, b) \quad \text{and} \quad \lambda_f = \max_{a} \max_{b \ne 0} \lambda_f(a, b)$$

are small. Moreover if $f$ defines the S-boxes of a Feistel cipher, the values of $\delta_f$
and $\lambda_f$ completely determine the complexity of differential and linear
cryptanalysis.

Proposition 1. For any function $f : \mathbb{F}_2^m \to \mathbb{F}_2^m$,

$$\delta_f \ge 2 .$$

In case of equality f is called almost perfect nonlinear (APN).

Proposition 2. For any function $f : \mathbb{F}_2^m \to \mathbb{F}_2^m$,

$$\lambda_f \ge 2^{(m-1)/2} .$$

In case of equality f is called almost bent (AB).

Note that this minimum value for $\lambda_f$ can only be achieved if m is odd. For even m,
some functions with $\lambda_f = 2^{m/2}$ are known and it is conjectured that this value is
the minimum.
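Both quantities can be computed by brute force for small m. The following sketch assumes a polynomial-basis representation of $\mathbb{F}_{2^5}$ with reduction polynomial $x^5 + x^2 + 1$ (a choice made here for illustration), identifies field elements with bit vectors, and uses bitwise parity for the dot product. The helper names are hypothetical.

```python
from collections import Counter


def gf_mul(a, b, m, poly):
    """Multiply in F_{2^m}; poly is the reduction polynomial including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def power_table(s, m, poly):
    """Table of x -> x^s on F_{2^m} (square-and-multiply, s >= 1)."""
    table = []
    for x in range(1 << m):
        y, base, e = 1, x, s
        while e:
            if e & 1:
                y = gf_mul(y, base, m, poly)
            base = gf_mul(base, base, m, poly)
            e >>= 1
        table.append(y)
    return table


def parity(v):
    return bin(v).count("1") & 1


def delta_f(f, m):
    best = 0
    for a in range(1, 1 << m):
        counts = Counter(f[x ^ a] ^ f[x] for x in range(1 << m))
        best = max(best, max(counts.values()))
    return best


def lambda_f(f, m):
    best = 0
    for b in range(1, 1 << m):
        for a in range(1 << m):
            zeros = sum(1 for x in range(1 << m)
                        if parity(a & x) == parity(b & f[x]))
            best = max(best, abs(zeros - (1 << (m - 1))))
    return best
```

With m = 5 the cube map $x \mapsto x^3$ reaches both lower bounds, while the inverse map $x \mapsto x^{30}$ reaches only the bound of Proposition 1.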

From now on the vector space $\mathbb{F}_2^m$ is identified with the finite field $\mathbb{F}_{2^m}$. The
function $f$ can then be expressed as a unique polynomial of $\mathbb{F}_{2^m}[X]$ of degree
at most $(2^m - 1)$. Note that the values of $\delta_f$ and $\lambda_f$ are invariant under both
right and left compositions by a linear permutation of $\mathbb{F}_{2^m}$. Similarly, if $f$ is
a permutation, $\delta_f = \delta_{f^{-1}}$ and $\lambda_f = \lambda_{f^{-1}}$. We can then assume that $f(0) = 0$
without loss of generality.

Both APN and AB properties can also be expressed in terms of error-correcting
codes. We use standard notation of the algebraic coding theory.
The (Hamming) weight of any vector $x \in \mathbb{F}_2^n$ is denoted by $wt(x)$. Any
k-dimensional linear subspace of $\mathbb{F}_2^n$ is called a binary linear code of length n
and dimension k and is denoted by [n, k]. Any [n, k]-linear code C is associated
with its dual [n, n - k]-code, denoted by $C^\perp$:

$$C^\perp = \{x \in \mathbb{F}_2^n,\ x \cdot c = 0\ \ \forall c \in C\} .$$

Any $r \times n$ binary matrix H defines an [n, n - r]-binary linear code C:

$$C = \{c \in \mathbb{F}_2^n,\ cH^T = 0\}$$

where $H^T$ is the transposed matrix of H. We then say that H is a parity-check
matrix of C. The proofs of the following results are developed by Carlet, Charpin
and Zinoviev.

Theorem 1. Let f be a function from $\mathbb{F}_{2^m}$ into $\mathbb{F}_{2^m}$ with $f(0) = 0$. Let $C_f$ be
the linear binary code of length $2^m - 1$ defined by the $2m \times (2^m - 1)$ parity-check
matrix

$$H_f = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^m-2} \\ f(1) & f(\alpha) & f(\alpha^2) & \cdots & f(\alpha^{2^m-2}) \end{pmatrix}$$

where each entry is viewed as a binary column vector of length m and $\alpha$ is a
primitive element of $\mathbb{F}_{2^m}$. Then

(i) $\lambda_f = 2^{m-1}$ if and only if $\dim C_f > 2^m - 1 - 2m$ or $C_f^\perp$ contains the all-one
vector.

(ii) If $\dim C_f = 2^m - 1 - 2m$,

$$\lambda_f = \max_{c \in C_f^\perp,\ c \ne 0} \left|2^{m-1} - wt(c)\right| .$$

In particular, for odd m, f is AB if and only if for any non-zero codeword
$c \in C_f^\perp$,

$$wt(c) \in \{2^{m-1} - 2^{(m-1)/2},\ 2^{m-1},\ 2^{m-1} + 2^{(m-1)/2}\} .$$

(iii) f is APN if and only if the code $C_f$ has minimum distance 5.
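Part (ii) can be checked numerically on a small case. The sketch below lists the weights of the dual code $C_f^\perp$ for a power function by enumerating the codewords $(a \cdot x + b \cdot f(x))_{x \ne 0}$ over all pairs (a, b). It assumes the polynomial basis of $\mathbb{F}_{2^5}$ defined by $x^5 + x^2 + 1$; the function names are hypothetical and the column order of the parity-check matrix is ignored because it does not affect weights.

```python
from collections import Counter


def gf_mul(a, b, m, poly):
    """Multiply in F_{2^m}; poly is the reduction polynomial including x^m."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return result


def gf_pow(x, e, m, poly):
    y = 1
    while e:
        if e & 1:
            y = gf_mul(y, x, m, poly)
        x = gf_mul(x, x, m, poly)
        e >>= 1
    return y


def parity(v):
    return bin(v).count("1") & 1


def dual_weight_enumerator(s, m, poly):
    """Counter {weight: number of pairs (a, b)} for the row space of H_f."""
    points = [(x, gf_pow(x, s, m, poly)) for x in range(1, 1 << m)]
    weights = Counter()
    for a in range(1 << m):
        for b in range(1 << m):
            w = sum(parity(a & x) ^ parity(b & fx) for x, fx in points)
            weights[w] += 1
    return weights
```

For m = 5 and s = 3 the only weights that occur are 0, 12, 16 and 20, that is $2^{m-1}$ and $2^{m-1} \pm 2^{(m-1)/2}$, as expected for an AB function.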

Tables 1 (resp. 2) give all known and conjectured values of exponents s (up
to equivalence) such that the power function $x \mapsto x^s$ is APN (resp. AB). AB
power permutations also correspond to pairs of maximum-length sequences with
preferred crosscorrelation.



Table 1. Known and conjectured APN power functions $x^s$ on $\mathbb{F}_{2^m}$ with $m = 2t + 1$

Name | Exponents s | Status
quadratic functions | $2^i + 1$ with $\gcd(i, m) = 1$ and $1 \le i \le t$ | proven
Kasami's functions | $2^{2i} - 2^i + 1$ with $\gcd(i, m) = 1$ and $2 \le i \le t$ | proven
inverse function | $2^{2t} - 1$ | proven
Welch's function | $2^t + 3$ | proven
Niho's function | $2^t + 2^{t/2} - 1$ if t is even; $2^t + 2^{(3t+1)/2} - 1$ if t is odd | proven
Dobbertin's function | $2^{4i} + 2^{3i} + 2^{2i} + 2^i - 1$ if $m = 5i$ | conjectured



Table 2. Known AB power permutations $x^s$ on $\mathbb{F}_{2^m}$ with $m = 2t + 1$

Name | Exponents s | Status
quadratic functions | $2^i + 1$ with $\gcd(i, m) = 1$ and $1 \le i \le t$ | proven [10, 19]
Kasami's functions | $2^{2i} - 2^i + 1$ with $\gcd(i, m) = 1$ and $2 \le i \le t$ | proven [14]
Welch's function | $2^t + 3$ | proven [4, 3]
Niho's function | $2^t + 2^{t/2} - 1$ if t is even; $2^t + 2^{(3t+1)/2} - 1$ if t is odd | proven



2.2 Weight Divisibility of Cyclic Codes 

We now give some properties of binary cyclic codes since the linear code $C_f$
associated to a power function $f : x \mapsto x^s$ on $\mathbb{F}_{2^m}$ is a binary cyclic code of
length $(2^m - 1)$ with two zeros. We especially focus on the weight divisibility of
the duals of such codes.

Definition 1. A linear binary code C of length n is cyclic if for all codewords
$(c_0, \ldots, c_{n-1})$ in C, the vector $(c_{n-1}, c_0, \ldots, c_{n-2})$ is also in C.

If each vector $(c_0, \ldots, c_{n-1}) \in \mathbb{F}_2^n$ is associated with the polynomial
$c(X) = \sum_{i=0}^{n-1} c_i X^i$ in $\mathcal{R}_n = \mathbb{F}_2[X]/(X^n - 1)$, any binary cyclic code
of length n is an ideal
of $\mathcal{R}_n$. Since $\mathcal{R}_n$ is a principal ideal ring, any cyclic code C of length n is generated
by a unique monic polynomial g having minimal degree. This polynomial is
called the generator polynomial of the code and its roots are the zeros of C. The
defining set of C is then the set

$$I(C) = \{i \in \{0, \ldots, 2^m - 2\} \mid \alpha^i \text{ is a zero of } C\}$$

where $\alpha$ is a primitive element of $\mathbb{F}_{2^m}$. Since C is a binary code, its defining
set is a union of 2-cyclotomic cosets modulo $(2^m - 1)$, $Cl(a)$, where
$Cl(a) = \{2^j a \bmod (2^m - 1)\}$. From now on the defining set of a binary cyclic code of
length $(2^m - 1)$ is identified with the representatives of the corresponding
2-cyclotomic cosets modulo $(2^m - 1)$. The linear code $C_f$ associated to the power
function $f : x \mapsto x^s$ on $\mathbb{F}_{2^m}$ is defined by the following parity-check matrix:

$$H_f = \begin{pmatrix} 1 & \alpha & \alpha^2 & \cdots & \alpha^{2^m-2} \\ 1 & \alpha^s & \alpha^{2s} & \cdots & \alpha^{(2^m-2)s} \end{pmatrix}$$

It then consists of all binary vectors c of length $(2^m - 1)$ such that $cH_f^T = 0$, i.e.

$$c(\alpha) = \sum_{i=0}^{2^m-2} c_i \alpha^i = 0 \quad \text{and} \quad c(\alpha^s) = \sum_{i=0}^{2^m-2} c_i \alpha^{is} = 0 .$$

The code $C_f$ is therefore the binary cyclic code of length $(2^m - 1)$ with defining
set {1, s}.

Definition 2. A binary code C is said $2^\ell$-divisible if the weight of any of its
codewords is divisible by $2^\ell$. Moreover C is said exactly $2^\ell$-divisible if, additionally,
it contains at least one codeword whose weight is not divisible by $2^{\ell+1}$.

The following theorem due to McEliece reduces the determination of the
exact weight divisibility of binary cyclic codes to a combinatorial problem:

Theorem 2. A binary cyclic code is exactly $2^\ell$-divisible if and only if $\ell$ is
the smallest number such that $(\ell + 1)$ nonzeros of C (with repetitions allowed)
have product 1.

We now focus on primitive cyclic codes with two zeros and on the exact
weight divisibility of their duals. We denote by $C_{1,s}$ the binary cyclic code of
length $(2^m - 1)$ with defining set $Cl(1) \cup Cl(s)$. The nonzeros of the cyclic
code $C_{1,s}^\perp$ are the elements $\alpha^{-i}$ with $i \in Cl(1) \cup Cl(s)$. Then $(\ell + 1)$ nonzeros
of $C_{1,s}^\perp$ have product 1 if and only if there exist $I_1 \subset Cl(s)$ and $I_2 \subset Cl(1)$ with
$|I_1| + |I_2| = \ell + 1$ and

$$\prod_{k \in I_1 \cup I_2} \alpha^{-k} = 1 \iff \sum_{k \in I_1 \cup I_2} k \equiv 0 \bmod (2^m - 1) .$$

We consider both integers u and v defined by their 2-adic expansions:
$u = \sum_{i=0}^{m-1} u_i 2^i$ and $v = \sum_{i=0}^{m-1} v_i 2^i$ where $u_i = 1$ if and only if
$2^i s \bmod (2^m - 1) \in I_1$
and $v_i = 1$ if and only if $2^i \bmod (2^m - 1) \in I_2$. Then we have

$$\sum_{k \in I_1 \cup I_2} k \equiv \sum_{i=0}^{m-1} u_i 2^i s + \sum_{i=0}^{m-1} v_i 2^i \bmod (2^m - 1) \equiv 0 \bmod (2^m - 1) .$$

The size of $I_1$ (resp. $I_2$) corresponds to $w_2(u) = \sum_{i=0}^{m-1} u_i$, which is the 2-weight
of u (resp. v). McEliece's theorem can then be formulated as follows:

Corollary 1. The cyclic code $C_{1,s}^\perp$ of length $(2^m - 1)$ is exactly $2^\ell$-divisible if
and only if for all (u, v) such that $0 \le u \le 2^m - 1$, $0 \le v \le 2^m - 1$ and

$$us + v \equiv 0 \bmod (2^m - 1),$$

we have $w_2(u) + w_2(v) \ge \ell + 1$.

Since $v \le 2^m - 1$, the condition $us + v \equiv 0 \bmod (2^m - 1)$ can be written
$v = (2^m - 1) - (us \bmod (2^m - 1))$. This leads to the following equivalent formulation:

Corollary 2. The cyclic code $C_{1,s}^\perp$ of length $(2^m - 1)$ is exactly $2^\ell$-divisible if
and only if for all u such that $0 \le u \le 2^m - 1$,

$$w_2(A(u)) \le w_2(u) + m - 1 - \ell$$

where $A(u) = us \bmod (2^m - 1)$.
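Corollary 2 turns the divisibility question into a search over integers. As a minimal sketch, assuming $\gcd(s, 2^m - 1) = 1$ so that A(u) is nonzero for every u in the scanned range, the largest $\ell$ compatible with the inequality is the minimum of $w_2(u) + m - 1 - w_2(A(u))$. The function names are hypothetical.

```python
from math import gcd


def w2(n):
    """2-weight: number of ones in the binary expansion of n."""
    return bin(n).count("1")


def divisibility_exponent(m, s):
    """Largest l with w2(A(u)) <= w2(u) + m - 1 - l for all 1 <= u <= 2^m - 2."""
    n = (1 << m) - 1
    if gcd(s, n) != 1:
        raise ValueError("this sketch assumes gcd(s, 2^m - 1) = 1")
    return min(w2(u) + m - 1 - w2(u * s % n) for u in range(1, n))
```

For instance, with m = 5 the exponent s = 3 gives 2 (reached at u = 5, where A(u) = 15), while the inverse exponent s = 15 gives only 1.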

3 Characterization of Almost Bent Functions 

As previously seen the nonlinearity of a function from $\mathbb{F}_{2^m}$ into $\mathbb{F}_{2^m}$ is related
to the weight distributions of some linear binary codes of length $(2^m - 1)$ and
dimension 2m. We here give some general results on the weight distributions
of linear codes having these parameters. Our method uses Pless power moment
identities and some ideas due to Kasami (see also [3, th. 4]).
The weight enumerator of a linear code C of length n is the vector $(A_0, \ldots, A_n)$
where $A_i$ is the number of codewords of weight i in C.

Theorem 3. Let C be a $[2^m - 1, 2^m - 2m - 1]$ linear binary code with minimum
distance $d \ge 3$. Assume that the dual code $C^\perp$ does not contain the all-one
vector $\mathbf{1} = (1, \ldots, 1)$. Let $A = (A_0, \ldots, A_{2^m-1})$ (resp. $B = (B_0, \ldots, B_{2^m-1})$) be
the weight enumerator of $C^\perp$ (resp. C). Then we have

(i) If $w_0$ is such that $A_w = A_{2^m-w} = 0$ for all $0 < w < w_0$, then

$$6(B_3 + B_4) \le (2^m - 1)\left[(2^{m-1} - w_0)^2 - 2^{m-1}\right]$$

where equality holds if and only if $A_w = 0$ for all $w \notin \{0, w_0, 2^{m-1}, 2^m - w_0\}$.

(ii) If $w_1$ is such that $A_w = A_{2^m-w} = 0$ for all $w_1 < w < 2^{m-1}$, then

$$6(B_3 + B_4) \ge (2^m - 1)\left[(2^{m-1} - w_1)^2 - 2^{m-1}\right]$$

where equality holds if and only if $A_w = 0$ for all $w \notin \{0, w_1, 2^{m-1}, 2^m - w_1\}$.
Proof. The main part of the proof relies on the first Pless power moment
identities. The first four power moment identities on the weight distribution of
the $[2^m - 1, 2m]$-code $C^\perp$ are (with $n = 2^m - 1$):

$$\sum_{w=0}^{n} w A_w = 2^{2m-1}(2^m - 1),$$

$$\sum_{w=0}^{n} w^2 A_w = 2^{3m-2}(2^m - 1),$$

$$\sum_{w=0}^{n} w^3 A_w = 2^{2m-3}\left((2^m - 1)^2(2^m + 2) - 3!\,B_3\right),$$

$$\sum_{w=0}^{n} w^4 A_w = 2^{2m-4}\left(2^m(2^m - 1)(2^{2m} + 3 \cdot 2^m - 6) + 4!\,(B_4 - (2^m - 1)B_3)\right) .$$

Let us consider the numbers $I_\ell = \sum_{w=1}^{2^m-1} (w - 2^{m-1})^\ell A_w$. Since for $\ell$ even

$$(w - 2^{m-1})^\ell = ((2^m - w) - 2^{m-1})^\ell ,$$

we have for any even $\ell$:

$$\sum_{w=1}^{2^m-1} (w - 2^{m-1})^\ell A_w = \sum_{w=1}^{2^{m-1}-1} (w - 2^{m-1})^\ell (A_w + A_{2^m-w}) .$$

Note that the codeword of weight zero is not taken into account in the sum above.
Recall that $C^\perp$ does not contain the all-one codeword. By using the four power
moments, we obtain the following values for $I_2$ and $I_4$:

$$I_2 = 2^{2m-2}(2^m - 1)$$

$$I_4 = 2^{2m-2}\left[6(B_3 + B_4) + 2^{m-1}(2^m - 1)\right]$$

This implies

$$I(x) = I_4 - x^2 I_2 = \sum_{w=1}^{2^{m-1}-1} (w - 2^{m-1})^2\left((w - 2^{m-1})^2 - x^2\right)(A_w + A_{2^m-w})$$

$$= 2^{2m-2}\left[6(B_3 + B_4) + (2^m - 1)(2^{m-1} - x^2)\right]$$

The w-th term in this sum satisfies:

$(w - 2^{m-1})^2\left((w - 2^{m-1})^2 - x^2\right)$
$< 0$ if $0 < |2^{m-1} - w| < x$,
$= 0$ if $w \in \{2^{m-1}, 2^{m-1} \pm x\}$,
$> 0$ if $|2^{m-1} - w| > x$.

This implies that, if $A_w = A_{2^m-w} = 0$ for all w such that $0 < w < w_0$, all
the terms in $I(2^{m-1} - w_0)$ are negative or zero. Then we have

$$6(B_3 + B_4) + (2^m - 1)\left[2^{m-1} - (2^{m-1} - w_0)^2\right] \le 0$$

with equality if and only if all terms in the sum are zero. This can only occur
when $A_w = 0$ for all $w \notin \{0, w_0, 2^{m-1}, 2^m - w_0\}$.

Similarly, if $A_w = A_{2^m-w} = 0$ for all w such that $w_1 < w < 2^{m-1}$, all the
terms in $I(2^{m-1} - w_1)$ are positive or zero. Then we have

$$6(B_3 + B_4) + (2^m - 1)\left[2^{m-1} - (2^{m-1} - w_1)^2\right] \ge 0$$

with equality if and only if all terms in the sum are zero, i.e. if $A_w = 0$ for all
$w \notin \{0, w_1, 2^{m-1}, 2^m - w_1\}$. □
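As a worked step, the value of the second centered moment $I_2 = \sum_{w \ge 1} (w - 2^{m-1})^2 A_w$ used in this proof can be recovered from the first two power moments. The derivation below assumes only that the dual code has $2^{2m}$ codewords and that $A_0 = 1$; the sums run over $0 \le w \le 2^m - 1$.

```latex
\begin{aligned}
\sum_{w} (w - 2^{m-1})^2 A_w
  &= \sum_{w} w^2 A_w \;-\; 2^{m} \sum_{w} w A_w \;+\; 2^{2m-2} \sum_{w} A_w \\
  &= 2^{3m-2}(2^m - 1) \;-\; 2^{3m-1}(2^m - 1) \;+\; 2^{4m-2} \\
  &= 2^{3m-2} . \\[4pt]
I_2 &= 2^{3m-2} - (0 - 2^{m-1})^2 A_0 \;=\; 2^{3m-2} - 2^{2m-2} \;=\; 2^{2m-2}(2^m - 1) .
\end{aligned}
```

The same expansion applied to the fourth power, with the third and fourth moments, gives the corresponding expression for $I_4$ in terms of $B_3 + B_4$.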

Let us now suppose that m is odd, m = 2t + 1. We give a necessary and
sufficient condition on $f : \mathbb{F}_{2^m} \to \mathbb{F}_{2^m}$ to be almost bent.

Theorem 4. Let m be an odd integer and let f be a function from $\mathbb{F}_{2^m}$ into $\mathbb{F}_{2^m}$
such that $\lambda_f \ne 2^{m-1}$. Then f is AB if and only if f is APN and the code $C_f^\perp$
defined in Theorem 1 is $2^{(m-1)/2}$-divisible.

Proof. Let $(A_0, \ldots, A_{2^m-1})$ (resp. $(B_0, \ldots, B_{2^m-1})$) be the weight enumerator
of $C_f^\perp$ (resp. $C_f$) and let $w_0$ be the smallest w such that $0 < w < 2^{m-1}$ and
$A_w + A_{2^m-w} \ne 0$, so that $A_w = A_{2^m-w} = 0$ for all $0 < w < w_0$. According to
Theorem 1 (ii), f is AB if and
only if $w_0 = 2^{m-1} - 2^{(m-1)/2}$. Since $\lambda_f \ne 2^{m-1}$, we deduce from Theorem 1 (i) that
the code $C_f$ has dimension $2^m - 2m - 1$ and that $C_f^\perp$ does not contain the all-one
vector. Since the minimum distance of $C_f$ is obviously at least 3, Theorem 3
can be applied. The announced condition is necessary: if $w_0 = 2^{m-1} - 2^{(m-1)/2}$ we
have that $B_3 + B_4 = 0$ according to Theorem 3 (i). This means that $C_f$ has
minimum distance 5 (i.e. f is APN). Moreover all nonzero weights of $C_f^\perp$ lie in
$\{2^{m-1}, 2^{m-1} \pm 2^{(m-1)/2}\}$. The code $C_f^\perp$ is therefore $2^{(m-1)/2}$-divisible.

The condition is also sufficient since, for any w such that
$2^{m-1} - 2^{(m-1)/2} < w < 2^{m-1}$, the integers w and $2^m - w$ are not divisible by
$2^{(m-1)/2}$. The condition
on the divisibility of the weights of $C_f^\perp$ then implies that $A_w = A_{2^m-w} = 0$ for
all w such that $2^{m-1} - 2^{(m-1)/2} < w < 2^{m-1}$. If f is APN, $C_f$ does not contain any
codeword of weight 3 and 4. The lower bound given in Theorem 3 (ii) (applied
with $w_1 = 2^{m-1} - 2^{(m-1)/2}$) is then reached. It follows that the weight of every
codeword in $C_f^\perp$ lies in $\{0, 2^{m-1}, 2^{m-1} \pm 2^{(m-1)/2}\}$ and therefore that f is AB. □

When f is a power function, $f : x \mapsto x^s$, the corresponding code $C_f$ is the
binary cyclic code $C_{1,s}$ of length $(2^m - 1)$ with defining set {1, s}. The weight
divisibility of the corresponding dual code can therefore be obtained by applying
McEliece's theorem, as expressed in Corollary 2. This leads to the following
characterization of AB power functions:

Corollary 3. Let m = 2t + 1. Assume that the power function $f : x \mapsto x^s$ on
$\mathbb{F}_{2^m}$ has no affine component. Then f is AB on $\mathbb{F}_{2^m}$ if and only if f is APN
on $\mathbb{F}_{2^m}$ and

$$\forall u,\ 1 \le u \le 2^m - 1, \quad w_2(A(u)) \le t + w_2(u) \qquad (2)$$

where $A(u) = us \bmod (2^m - 1)$.
Condition (2) is obviously satisfied when $w_2(u) \ge t + 1$. Moreover, if
$\gcd(s, 2^m - 1) = 1$ (i.e. if $x \mapsto x^s$ is a permutation), the condition also holds
for all u such that $w_2(u) = t$. Using that

$$A(u 2^i \bmod (2^m - 1)) = 2^i A(u) \bmod (2^m - 1),$$

we deduce that Condition (2) must only be checked for one element in each
cyclotomic coset. Note that if u is the smallest element in its cyclotomic coset
and $w_2(u) < t$, we have $u \le 2^{m-1} - 1$. This result provides a fast algorithm for
checking whether an APN power function is AB, and then for finding all AB
power functions on $\mathbb{F}_{2^m}$. There are roughly $2^{m-1}/m$ cyclotomic representatives u
such that $w_2(u) \le t$ and each test requires one modular multiplication on m-bit
integers and two weight computations. Condition (2) can then be checked with
around $2^m$ elementary operations and at no memory cost.
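A minimal sketch of this check is given below, assuming nothing beyond the statement of Condition (2). It tests only coset leaders of 2-weight at most t and is compared against a brute-force scan over all u; the helper names are hypothetical. Note that it verifies the weight condition only, not the APN property.

```python
def w2(n):
    """2-weight: number of ones in the binary expansion of n."""
    return bin(n).count("1")


def is_coset_leader(u, m):
    """True if u is the smallest element of its 2-cyclotomic coset mod 2^m - 1."""
    mask = (1 << m) - 1
    r = u
    for _ in range(m - 1):
        r = ((r << 1) | (r >> (m - 1))) & mask   # multiply by 2 mod 2^m - 1
        if r < u:
            return False
    return True


def weight_condition_fast(m, s):
    """Condition (2), checked on coset leaders u with w2(u) <= t only."""
    t = (m - 1) // 2
    n = (1 << m) - 1
    for u in range(1, 1 << (m - 1)):
        if w2(u) > t or not is_coset_leader(u, m):
            continue
        if w2(u * s % n) > t + w2(u):
            return False
    return True


def weight_condition_brute(m, s):
    """Condition (2), checked for every u with 1 <= u <= 2^m - 1."""
    t = (m - 1) // 2
    n = (1 << m) - 1
    return all(w2(u * s % n) <= t + w2(u) for u in range(1, n + 1))
```

The restriction to leaders is safe because multiplying u by 2 modulo $2^m - 1$ rotates the binary expansions of both u and A(u), leaving their 2-weights unchanged.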

The 2-weight of s obviously gives an upper bound on the weight divisibility
of $C_{1,s}^\perp$ (obtained for u = 1 in Corollary 2). Using this result, we immediately
recover the known condition on the degree of AB functions in
the particular case of power functions.

Corollary 4. Let m be an odd integer. If the power permutation $f : x \mapsto x^s$ is
AB on $\mathbb{F}_{2^m}$, then

$$\mathrm{degree}(f) = w_2(s) \le \frac{m + 1}{2} .$$



4 Power Functions $x \mapsto x^s$ on $\mathbb{F}_{2^m}$ with $s = 2^{(m-1)/2} + 2^i - 1$

In his 1968 paper, Golomb mentioned a conjecture of Welch stating that for
m = 2t + 1, the power function $x \mapsto x^s$ with $s = 2^t + 3$ is AB on $\mathbb{F}_{2^m}$. Niho
stated a similar conjecture for $s = 2^t + 2^{t/2} - 1$ when t is even and $s = 2^t + 2^{(3t+1)/2} - 1$
when t is odd. Note that all of these exponents s can be written as $2^t + 2^i - 1$
for some i. Since both Welch's and Niho's functions are APN [9, 8], Corollary 3
leads to the following formulation of Welch's and Niho's conjectures:

Let m = 2t + 1 be an odd integer. For all u such that $1 \le u \le 2^m - 1$, we have

$$w_2((2^t + 2^i - 1)u \bmod (2^m - 1)) \le t + w_2(u) \qquad (3)$$

for the following values of i: i = 2, i = t/2 for even t and i = (3t + 1)/2 for odd t.
We proved that Condition (3) is satisfied in the Welch case (i = 2) [4, 3]. More
recently Xiang and Hollmann used this formulation for proving Niho's
conjecture. We here focus on all other values of s which can be expressed as
$s = 2^t + 2^i - 1$ for some i. We prove that for almost all of these values $x \mapsto x^s$
is not AB on $\mathbb{F}_{2^m}$. This result is derived from both following lemmas which give
an upper bound on the exact weight divisibility of $C_{1,s}^\perp$.

Lemma 1. Let m = 2t + 1 be an odd integer and $s = 2^t + 2^i - 1$ with $2 < i < t - 1$.
Let $C_{1,s}$ be the binary cyclic code of length $(2^m - 1)$ with defining set {1, s}. If
$2^\ell$ denotes the exact divisibility of $C_{1,s}^\perp$, we have

— if $t \equiv 0 \bmod i$ and $i \ne t/2$, then $\ell \le t - 1$,

— if $t \equiv 1 \bmod i$, then $\ell \le t - i + 2$,

— if $t \equiv r \bmod i$ with $1 < r < i$, then $\ell \le t - i + r$.

Proof. Let t = iq + r with r < i and $A(u) = (2^t + 2^i - 1)u \bmod (2^m - 1)$. McEliece's
theorem (Corollary 2) implies that $C_{1,s}^\perp$ is at most $2^\ell$-divisible if there exists an
integer $u \in \{0, \ldots, 2^m - 1\}$ such that $w_2(A(u)) = w_2(u) + 2t - \ell$. We here exhibit
an integer u satisfying this condition for the announced values of $\ell$.

— We first consider the case $r \ne 0$. Let $u = 2^t + 2^{r-1} \sum_{k=1}^{q} 2^{ik} + 1$. Then
$w_2(u) = q + 2$ and we have

$$A(u) = 2^{2t} + 2^{t+i} + \sum_{k=1}^{q} 2^{t+r-1+ik} + (2^{t+i-1} - 2^{i+r-1}) + (2^i - 1) . \qquad (4)$$

If r > 1, we have $t + i < t + r - 1 + ik \le 2t - 1$ for all k such that $1 \le k \le q$.
All terms in (4) are then distinct. It follows that

$$w_2(A(u)) = 1 + q + 1 + (t - r) + i = w_2(u) + t - r + i .$$

If r = 1, we obtain

$$A(u) = 2^{2t} + 2^t \sum_{k=2}^{q} 2^{ik} + 2^{t+i+1} + 2^{t+i-1} - 1 .$$

In this case

$$w_2(A(u)) = 1 + (q - 1) + 1 + (t + i - 1) = w_2(u) + t + i - 2 .$$

— Suppose now that r = 0 and $i \ne t/2$. Since i < t, we have q > 2. Let
$u = 2^{t+i} + 2^{t+2} + 2^t + 2^{i+2} \sum_{k=0}^{q-2} 2^{ik} + 1$. Using that i > 2, we deduce that
$i + 2 + ik \le i(q - 1) + 2 \le t - i + 2 < t$ for all $k \le q - 2$. It follows that
$w_2(u) = q + 3$. Let us now expand the corresponding A(u):

$$A(u) = 2^{2t} + \sum_{k=0}^{q-3} 2^{t+2+(k+2)i} + 2^{t+2i} + 2^{t+i+3} - 2^{i+2} + 2^i + 2^{i-1} + 1 . \qquad (5)$$

If i > 2, all values of k such that $0 \le k \le q - 3$ satisfy $t + 2i < t + 2 + (k + 2)i < 2t$.
We then deduce that, if q > 2, all the terms in (5) are distinct except if
i = 3. It follows that, for any i > 3,

$$w_2(A(u)) = 1 + (q - 2) + 1 + (t + 1) + 3 = w_2(u) + t + 1 .$$

For i = 3, we have

$$A(u) = 2^{2t} + \sum_{k=0}^{q-3} 2^{t+3k+8} + 2^{t+7} - 2^5 + 2^3 + 2^2 + 1 .$$

In this case

$$w_2(A(u)) = 1 + (q - 2) + (t + 2) + 3 = w_2(u) + t + 1 .$$

□
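The witnesses used in this proof can be checked numerically. The sketch below rebuilds the integer u for each case of Lemma 1, following the case analysis on t = iq + r as reconstructed here (an assumption, since the printed formulas are partly illegible), and tests that $w_2(A(u)) = w_2(u) + 2t - \ell$ for the bound $\ell$ stated in the lemma. The function names are hypothetical.

```python
def w2(n):
    """2-weight: number of ones in the binary expansion of n."""
    return bin(n).count("1")


def lemma1_witness(t, i):
    """Return (u, bound) for 2 < i < t - 1, with i != t/2 when i divides t."""
    q, r = divmod(t, i)
    if r == 0:
        u = (1 << (t + i)) + (1 << (t + 2)) + (1 << t) + 1
        u += sum(1 << (i + 2 + i * k) for k in range(q - 1))   # k = 0 .. q-2
        return u, t - 1
    u = (1 << t) + 1 + sum(1 << (r - 1 + i * k) for k in range(1, q + 1))
    return u, (t - i + 2 if r == 1 else t - i + r)


def witness_reaches_bound(t, i):
    """Check w2(A(u)) == w2(u) + 2t - bound for s = 2^t + 2^i - 1."""
    n = (1 << (2 * t + 1)) - 1
    s = (1 << t) + (1 << i) - 1
    u, bound = lemma1_witness(t, i)
    return w2(u * s % n) == w2(u) + 2 * t - bound
```

For example, t = 5 and i = 3 give u = 49 and A(u) = 1911, whose 2-weight 9 equals $w_2(u) + 2t - \ell$ with $\ell = t - i + r = 4$.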
Lemma 2. Let m = 2t + 1 be an odd integer and $s = 2^t + 2^i - 1$ with $t + 1 < i < 2t$.
Let $C_{1,s}$ be the binary cyclic code of length $(2^m - 1)$ with defining set {1, s}. If
$2^\ell$ denotes the exact divisibility of $C_{1,s}^\perp$, we have

— if $t + 1 < i < \frac{3t+1}{2}$, then $\ell \le m - i$,

— if $\frac{3t+1}{2} < i < 2t - 1$, then $\ell \le 2(m - i) - 1$,

— if $i = 2t - 1$, then $\ell \le 3$.

Proof. Let A{u) = (2* + 2® — 1)m mod (2™ — 1). Exactly as in the proof of the 
previous lemma, we exhibit an integer u G {0, . . . , 2™ — 1} such that W 2 {A{u)) = 
W 2 {u) + 2t — £ for the announced values of £. We write i = t + j where 1 < j < t. 

— We first consider the case t + 1 < i < Let rt = 2* + 2^~^ + 1. Then 

W 2 {u) = 3 and 

A{u) = 2^‘ + + 2‘+^‘ + 2‘+^-i - 1 . (6) 

Since j < Lti, we have that 2t > t + 2j — 1. All the terms in (0 are therefore 
distinct. We deduce 

W2{A{u)) = 3 + (f + j — 1) = W2{u) +i — l. 

— We now focus on the case $\frac{3t+1}{2} < i \le 2t - 1$. Let $u = 2^t + 2^j + 1$. Then
$w_2(u) = 3$ and

$$A(u) = 2^{2t} + 2^{t+j+1} - 2^{j-1} + 2^{2j-t-1} - 1 . \qquad (7)$$

Since $\frac{t+1}{2} < j < t$, we have $0 < 2j - t - 1 < j - 1$. If $j \neq t - 1$, all the
exponents in (7) are distinct. It follows that

$$w_2(A(u)) = 1 + (t + 2) + (2j - t - 1) = w_2(u) + 2(i - t) - 1 .$$

If $j = t - 1$, we have

$$A(u) = 2^{2t+1} - 2^{j-1} + 2^{2j-t-1} - 1 .$$

In this case

$$w_2(A(u)) = (2t + 1) - (t - j) = w_2(u) + 2t - 3 . \qquad \Box$$
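The weight computations in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `w2`, `A` and `witness` are chosen here for illustration, and the boundary value $i = \frac{3t+1}{2}$ is handled by the second branch purely as a convenience of the sketch.

```python
def w2(x: int) -> int:
    """Binary weight (number of 1 bits) of a non-negative integer."""
    return bin(x).count("1")


def A(u: int, t: int, i: int) -> int:
    """A(u) = (2^t + 2^i - 1) * u mod (2^m - 1), with m = 2t + 1."""
    m = 2 * t + 1
    return ((2**t + 2**i - 1) * u) % (2**m - 1)


def witness(t: int, i: int) -> tuple[int, int]:
    """Return (u, w2(A(u)) - w2(u)) for the integer u used in the proof."""
    j = i - t
    if 2 * i < 3 * t + 1:
        # First case of the proof: u = 2^t + 2^(j-1) + 1
        u = 2**t + 2**(j - 1) + 1
    else:
        # Second case of the proof (including j = t - 1): u = 2^t + 2^j + 1
        u = 2**t + 2**j + 1
    return u, w2(A(u, t, i)) - w2(u)


def expected_excess(t: int, i: int) -> int:
    """The value 2t - l announced in Lemma 2 for each range of i."""
    if 2 * i < 3 * t + 1:
        return i - 1              # l = m - i
    if i < 2 * t - 1:
        return 2 * (i - t) - 1    # l = 2(m - i) - 1
    return 2 * t - 3              # l = 3
```

For instance, `witness(4, 6)` uses $u = 19$ and gives an excess of $5 = i - 1$, which corresponds to $\ell \le m - i = 3$ for $m = 9$.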



From Lemmas 1 and 2 we deduce the following theorem:

Theorem 5. Let $m = 2t + 1$ be an odd integer and let $s = 2^t + 2^i - 1$ with $i \in
\{1, \ldots, 2t\}$. The only values of $i$ such that $x \mapsto x^s$ is AB on $\mathbb{F}_{2^m}$ are
$1$, $2$, $\frac{t}{2}$, $t$, $t+1$, $\frac{3t+1}{2}$, $2t$ and maybe $t-1$.

Proof. If $i \notin \{1, 2, \frac{t}{2}, t-1, t, t+1, \frac{3t+1}{2}, 2t\}$, $C_{1,s}^\perp$ is not $2^t$-divisible since the
upper bounds given in both previous lemmas are strictly less than $t$. It follows
from Theorem [?] that the corresponding power functions are not AB. Moreover
$x \mapsto x^s$ is AB for $i \in \{1, 2, \frac{t}{2}, t, t+1, \frac{3t+1}{2}, 2t\}$:

— $i = 1$ corresponds to a quadratic function,

— $i = 2$ corresponds to the Welch function,

— $i = t$ corresponds to the inverse of a quadratic function since
$(2^{t+1} - 1)(2^t + 1) \equiv 2^t \bmod (2^m - 1)$,

— $i = t + 1$ corresponds to a Kasami function since
$2^t (2^{t+1} + 2^t - 1) \equiv 2^{2t} - 2^t + 1 \bmod (2^m - 1)$,

— $i = 2t$ gives an $s$ which is in the same 2-cyclotomic coset as $2^{t+1} - 1$,

— $i = \frac{t}{2}$ or $i = \frac{3t+1}{2}$ corresponds to the Niho function. □
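The three congruences used in this list are easy to confirm for small $t$. The sketch below is only an illustrative check; the function name `check_identities` is an assumption of this example.

```python
def check_identities(t: int) -> bool:
    """Check the congruences modulo 2^m - 1 (m = 2t + 1) used for i = t, t+1, 2t."""
    m = 2 * t + 1
    n = 2**m - 1
    # i = t: s = 2^(t+1) - 1 multiplied by the quadratic exponent 2^t + 1
    # is congruent to a power of 2
    inverse_of_quadratic = ((2**(t + 1) - 1) * (2**t + 1)) % n == 2**t
    # i = t + 1: s = 2^(t+1) + 2^t - 1 is conjugate to the Kasami exponent
    kasami = (2**t * (2**(t + 1) + 2**t - 1)) % n == 2**(2 * t) - 2**t + 1
    # i = 2t: doubling s = 2^t + 2^(2t) - 1 gives 2^(t+1) - 1
    same_coset = (2 * (2**t + 2**(2 * t) - 1)) % n == 2**(t + 1) - 1
    return inverse_of_quadratic and kasami and same_coset
```

Each identity holds for every $t \ge 1$, since $2^{2t+1} \equiv 1 \bmod (2^m - 1)$.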

The only unresolved case is then $i = t-1$. In accordance with our simulation
results for $m < 39$ we conjecture that the dual of the binary cyclic code of
length $(2^m - 1)$ with defining set $\{1, 2^t + 2^{t-1} - 1\}$ is exactly $2^t$-divisible. For
$m = 5$ and $m = 7$ the function $x \mapsto x^s$ for $s = 2^t + 2^{t-1} - 1$ is AB since it
respectively corresponds to a quadratic function and to the Welch function. On
the contrary it is known that this power function is not APN when 3 divides $m$
since $C_{1,s}$ has minimum distance 3 in this case [?, Th. 5]. We actually conjecture
that for any odd $m > 9$ the function $x \mapsto x^s$ with $s = 2^t + 2^{t-1} - 1$ is not APN
on $\mathbb{F}_{2^m}$.



5 AB Power Functions on $\mathbb{F}_{2^m}$ when $m$ Is Not a Prime



We now focus on AB power functions on $\mathbb{F}_{2^m}$ when $m$ is not a prime. We show
that in this case the nonlinearity of $x \mapsto x^s$ on $\mathbb{F}_{2^m}$ is closely related to the
nonlinearity of the power function $x \mapsto x^{s_0}$ on $\mathbb{F}_{2^g}$ where $g$ is a divisor of $m$ and $s_0 =
s \bmod (2^g - 1)$. We first derive an upper bound on the exact weight divisibility
of $C_{1,s}^\perp$ from the exact weight divisibility of the code $C_0^\perp$ of length $(2^g - 1)$.

Proposition 3. Let $g$ be a divisor of $m$. Let $C_{1,s}$ be the binary cyclic code
of length $(2^m - 1)$ with defining set $\{1, s\}$ and $C_0$ the binary cyclic code of
length $(2^g - 1)$ with defining set $\{1, s_0\}$ where $s_0 = s \bmod (2^g - 1)$. Assume
that $C_0^\perp$ is exactly $2^\ell$-divisible. Then $C_{1,s}^\perp$ is not $2^{\frac{m}{g}(\ell+1)}$-divisible.

Proof. Let $s = s_0 + a(2^g - 1)$. We here use McEliece's theorem as expressed in
Corollary [?]. If $C_0^\perp$ is exactly $2^\ell$-divisible, there exists a pair of integers $(u_0, v_0)$
with $u_0 < 2^g - 1$ and $v_0 < 2^g - 1$ such that

$$u_0 s_0 + v_0 \equiv 0 \bmod (2^g - 1) \quad\text{and}\quad w_2(u_0) + w_2(v_0) = \ell + 1 .$$

Let us now consider both integers $u$ and $v$ defined by

$$u = u_0 \, \frac{2^m - 1}{2^g - 1} \quad\text{and}\quad v = v_0 \, \frac{2^m - 1}{2^g - 1} .$$

For $s = s_0 + a(2^g - 1)$, the pair $(u, v)$ satisfies

$$us + v = u_0 a (2^m - 1) + \frac{2^m - 1}{2^g - 1}\,(u_0 s_0 + v_0) \equiv 0 \bmod (2^m - 1) .$$

Since $\frac{2^m - 1}{2^g - 1} = \sum_{k=0}^{m/g - 1} 2^{kg}$ and both $u_0$ and $v_0$ are less than $2^g - 1$, we have

$$w_2(u) + w_2(v) = \frac{m}{g}\,\bigl(w_2(u_0) + w_2(v_0)\bigr) = \frac{m}{g}\,(\ell + 1) .$$

We then deduce that $C_{1,s}^\perp$ is not $2^{\frac{m}{g}(\ell+1)}$-divisible. □
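The lifting step of this proof can be illustrated on small parameters. The sketch below is an assumption-laden example rather than the authors' procedure: `min_pair` finds a minimum-weight pair $(u_0, v_0)$ by brute force over $1 \le u_0 < 2^g - 1$, and `lift` multiplies by $(2^m - 1)/(2^g - 1)$; both names are hypothetical.

```python
def w2(x: int) -> int:
    """Binary weight of a non-negative integer."""
    return bin(x).count("1")


def min_pair(s0: int, g: int) -> tuple[int, int]:
    """Brute-force a pair (u0, v0), u0 != 0, with u0*s0 + v0 = 0 mod (2^g - 1)
    and w2(u0) + w2(v0) minimal."""
    n = 2**g - 1
    best = None
    for u0 in range(1, n):
        v0 = (-u0 * s0) % n
        if best is None or w2(u0) + w2(v0) < w2(best[0]) + w2(best[1]):
            best = (u0, v0)
    return best


def lift(x0: int, g: int, m: int) -> int:
    """Multiply by (2^m - 1)/(2^g - 1), i.e. repeat the g-bit pattern m/g times."""
    assert m % g == 0
    return x0 * ((2**m - 1) // (2**g - 1))
```

With, say, $g = 5$, $m = 15$, $s_0 = 3$ and $s = s_0 + 7(2^5 - 1)$, the lifted pair satisfies $us + v \equiv 0 \bmod (2^{15} - 1)$ and its total weight is three times that of $(u_0, v_0)$.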

We now derive a necessary condition on the values of the exponents which 
provide AB power functions. 

Theorem 6. Let $m$ be an odd integer. The power function $x \mapsto x^s$ is not AB
on $\mathbb{F}_{2^m}$ if there exists a divisor $g$ of $m$ with $g > 1$ satisfying one of the following
conditions:

1. $\exists i$, $0 \le i < g$, $s \equiv 2^i \bmod (2^g - 1)$,

2. $s_0 = s \bmod (2^g - 1) \neq 2^i$ and the dual of the cyclic code of length $(2^g - 1)$
with defining set $\{1, s_0\}$ is not $2^{\frac{g-1}{2}}$-divisible.

Proof. Theorem [?] provides a necessary condition for obtaining an AB power
function on $\mathbb{F}_{2^m}$: this function has to be APN and $C_{1,s}^\perp$ has to be $2^{\frac{m-1}{2}}$-divisible.
When $s \equiv 2^i \bmod (2^g - 1)$, it is known [?] that the cyclic code $C_{1,s}$ has minimum
distance 3. It follows that $x \mapsto x^s$ is not APN in this case. Suppose now that
the dual of the cyclic code of length $(2^g - 1)$ with defining set $\{1, s_0\}$ is exactly
$2^\ell$-divisible. According to Proposition 3 we have that $C_{1,s}^\perp$ is not $2^{\frac{m}{g}(\ell+1)}$-divisible.
If $C_{1,s}^\perp$ is $2^{\frac{m-1}{2}}$-divisible, it therefore follows that

$$\frac{m-1}{2} \le \frac{m}{g}(\ell + 1) - 1 .$$

This gives

$$\ell + 1 \ge \frac{g(m+1)}{2m} > \frac{g-1}{2}$$

since $(m+1)g > m(g-1)$. This implies that $C_0^\perp$ is $2^{\frac{g-1}{2}}$-divisible. □



Example 1. We search for all AB power permutations on $\mathbb{F}_{2^{21}}$. We here use
that the cyclic codes $C_{1,s_0}^\perp$ of length $(2^7 - 1)$ are at most 4-divisible when
$s_0 \in \{7, 19, 21, 31, 47, 55, 63\}$ (and for their cyclotomic conjugates). Amongst
the 42340 possible pairs of exponents $(s, s^{-1})$ such that $\gcd(s, 2^{21} - 1) = 1$ (up
to equivalence), only 5520 satisfy both conditions expressed in Corollary [?] and
Theorem [?]. By testing the weight divisibility of the corresponding cyclic codes
as described in Corollary [?] we obtain that only 20 such pairs correspond to a
$2^{10}$-divisible code $C_{1,s}^\perp$. The corresponding values of $\min(s, s^{-1})$ are:

$$\{3, 5, 13, 17, 33, 241, 257, 993, 1025, 1027, 1055, 3071, 8447\}$$
$$\cup\ \{171, 16259, 31729, 49789, 52429, 123423, 146312\} .$$

The exponents lying in the first set are known to provide AB functions (see
Table [?]). We finally check that the power functions corresponding to the second
set of exponents are not APN.

We now exhibit another family of power functions which are not AB:

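Proposition 4. Let $m$ be an odd integer. If there exists a divisor $g$ of $m$ such
that $s$ satisfies

$$s \equiv -s_0 \bmod \frac{2^m - 1}{2^g - 1} \quad\text{with}\quad 0 < s_0 < \frac{2^m - 1}{2^g - 1} \quad\text{and}\quad w_2(s_0) \le \frac{1}{2}\left(\frac{m}{g} - 3\right)$$

then the power function $x \mapsto x^s$ is not AB on $\mathbb{F}_{2^m}$.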




Proof. If the power function $x \mapsto x^s$ is AB on $\mathbb{F}_{2^m}$, we have that the dual of the
cyclic code $C_{1,s}$ of length $(2^m - 1)$ with defining set $\{1, s\}$ is $2^{\frac{m-1}{2}}$-divisible. We
here use McEliece's theorem as formulated in Corollary [?]. Let $u = 2^g - 1$. Then
we have

$$A(u) = us \bmod (2^m - 1) = (2^m - 1) - (2^g - 1)s_0 .$$

We obtain that $w_2(A(u)) = m - w_2((2^g - 1)s_0)$. Since $w_2((2^g - 1)s_0) \le g\,w_2(s_0)$,
this implies that

$$w_2(A(u)) \ge m - g\,w_2(s_0) = w_2(u) + m - g\,(w_2(s_0) + 1) > w_2(u) + \frac{m-1}{2}$$

when $w_2(s_0) \le \frac{1}{2}\left(\frac{m}{g} - 3\right)$. It follows that $C_{1,s}^\perp$ is not $2^{\frac{m-1}{2}}$-divisible. □

The third author conjectured that for $m = 5g$ the function $x \mapsto x^s$ with $s =
2^{4g} + 2^{3g} + 2^{2g} + 2^g - 1$ is APN on $\mathbb{F}_{2^m}$ [?]. Proposition 4 implies:

Proposition 5. Let $m$ be an odd integer such that $m = 5g$. The power function
$x \mapsto x^s$ with $s = 2^{4g} + 2^{3g} + 2^{2g} + 2^g - 1$ is not AB on $\mathbb{F}_{2^m}$.

Proof. Since $s = \frac{2^m - 1}{2^g - 1} - 2$, we apply Proposition 4 with $s_0 = 2$ and
$m/g = 5$ using that

$$w_2(s_0) = 1 = \frac{1}{2}\left(\frac{m}{g} - 3\right) . \qquad \Box$$
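The inequality behind Proposition 5 can be checked directly for small odd $g$. This is a minimal sketch with illustrative function names (`exponent`, `violates_ab_bound`); it evaluates $A(u)$ for $u = 2^g - 1$ and compares weights as in the proof of Proposition 4.

```python
def w2(x: int) -> int:
    """Binary weight of a non-negative integer."""
    return bin(x).count("1")


def exponent(g: int) -> int:
    """s = 2^(4g) + 2^(3g) + 2^(2g) + 2^g - 1, for m = 5g."""
    return 2**(4 * g) + 2**(3 * g) + 2**(2 * g) + 2**g - 1


def violates_ab_bound(g: int) -> bool:
    """True if w2(A(u)) > w2(u) + (m - 1)/2 for u = 2^g - 1 (g odd, m = 5g)."""
    m = 5 * g
    u = 2**g - 1
    a = (u * exponent(g)) % (2**m - 1)
    return w2(a) > w2(u) + (m - 1) // 2
```

For $g = 3$ one gets $A(u) = (2^{15} - 1) - 14$, of weight $12$, while $w_2(u) + \frac{m-1}{2} = 10$.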
Abstract. An iterated block cipher can be regarded as a means of producing
a set of permutations of a message space. Some properties of the
group generated by the round functions of such a cipher are known to be
of cryptanalytic interest. It is shown here that if this group acts imprimitively
on the message space then there is an exploitable weakness in
the cipher. It is demonstrated that a weakness of this type can be used
to construct a trapdoor that may be difficult to detect. An example of
a DES-like cipher, resistant to both linear and differential cryptanalysis
that generates an imprimitive group and is easily broken, is given. Some
implications for block cipher design are noted.



1 Introduction 

An iterated block cipher can be regarded as a means of producing a set of
permutations of a message space by the repetition of simpler round functions.
Properties of the groups generated by the round functions and by the actual
encryptions of such a cipher have long been recognised as having cryptographic
importance. For example, if either of these groups is “small” in size then the
cipher may be regarded as having a weakness, since not every possible permutation
of the message space can be realised by the cipher [?]. Moreover, multiple
encryption may offer little or no additional security if these groups are small.
Attacks on ciphers whose encryptions generate small groups were given in [?].

Naturally, much attention has been devoted to groups associated with the
DES algorithm. Early studies in [?] and [?] concentrated on the groups generated
by a set of “DES-like functions”, of which the actual round functions of DES
form a subset. It was shown that these functions can generate the alternating
group, a desirable property. Further work on this theme can be found in [26]. In
[?] it was shown that the actual round functions of DES generate the alternating
group. The question of whether the $2^{56}$ encryptions of the full DES algorithm
themselves form a group, or generate a small group (see [?]), was answered
in the negative in [?] and a lower bound of […] was obtained in [?] for the size
of this generated group. Thus the attacks of [?] are not applicable to DES.

Exchange Programme and the Swiss National Science Foundation, and was
performed whilst the author was visiting ETH Zurich.

However the ability of a cipher (or its round functions) to generate a large 
group does not alone guarantee security: an example of a weak cipher generating 
the symmetric group on the message space was given in [?]. The most that can
be said is that a small group may lead to an insecurity. 

Here we examine properties of the groups related to a block cipher more 
refined than simply their size. Consider the following statement of Wernsdorf 
regarding the group generated by the round functions of DES: 

“Since the generated alternating group $A_{2^{64}}$ is a large simple group and
primitive on $V_{64}$ [the message space] we can exclude several imaginable
cryptanalytic ‘shortcuts’ of the DES algorithm.”

In the next section we will formalise our discussion of the groups associated
with iterated block ciphers and sketch the theory of primitive and imprimitive
groups. Next, motivated by Wernsdorf’s statement, we examine attacks on iterated
block ciphers whose round functions generate imprimitive groups. Then we
argue that these imprimitivity-based attacks enable a designer to build trapdoors
into iterated block ciphers. We give an example of a 64-bit DES-like cipher
having 32 rounds and an 80-bit key which is resistant to basic linear and differential
cryptanalytic attacks but whose security is severely compromised by such
an attack using $2^{32}$ chosen plaintexts. With a careful (and deliberately weak)
choice of key-schedule and knowledge of the trapdoor, the cipher can be completely
broken using only a few known plaintexts and $2^{41}$ trial encryptions. While
the trapdoor in our example is not so well disguised, it can easily be made
undetectable if the cipher design is not made public. We conclude by giving some
implications of our work and ideas for future research.

We mention here the recent work of [?] in which block ciphers containing
partial trapdoors are constructed: these give only partial information about keys
and require rather large S-box components to be present in the cipher. Knowledge
of the trapdoor allows an efficient attack based on linear cryptanalysis [?].
Unfortunately, the work of [?] shows that these trapdoors are either easily
detected or yield only attacks requiring infeasible numbers of plaintext/ciphertext
pairs. In contrast, our trapdoor can be inserted into a block cipher with very
small S-boxes, reveals the entire key but is also detectable. In the language of
[?], it is a full, but detectable, trapdoor. It is a moot point whether trapdoors
that are both full and undetectable can be inserted in truly practical block
ciphers.



2 Iterated Block Ciphers and Their Groups 

We begin by describing a model for iterated block ciphers. We will regard such a
cipher as a set of invertible encryption functions mapping a set $M$, the message
space, to itself, or equivalently as a subset of the symmetric group on $M$, denoted
$S_M$. We can then use notions from the theory of permutation groups to study
such ciphers. The necessary algebraic background can be found in [?] or [?].
The encryption functions of a particular iterated block cipher are obtained
by the composition of round functions, that is, a set of keyed invertible functions
on $M$, which we denote by $\{R_k : M \to M,\ k \in K\}$. Here $K$ is called the round
keyspace and $k$ a round key. In a $t$-round iterated block cipher, the encryption
functions take the form

$$R_{k_1,\ldots,k_t} = R_{k_1} R_{k_2} \cdots R_{k_t}$$

where the $k_i$ may be derived from a key from a (larger) session keyspace according
to some key-scheduling algorithm, or may be independently chosen. So, the
encryption of plaintext $m$ under round keys $k_1, k_2, \ldots, k_t$ is

$$m R_{k_1,\ldots,k_t} = m R_{k_1} R_{k_2} \cdots R_{k_t}$$

(for the moment we denote all functions as acting on the right of their arguments,
so that in a composition, functions are evaluated from left to right).

We write $G = \langle R_k : k \in K \rangle$ for the group generated by the round functions,
that is, the smallest subgroup of $S_M$ containing each $R_k$. Similarly we write
$G_t = \langle R_{k_1} \cdots R_{k_t} : k_i \in K \rangle$ for the subgroup of $S_M$ generated by the $t$-round
encryptions with independent round keys. We say that $G$ and the $G_t$ act on
the message space $M$. The groups $G_t$ are hard to compute in practice, but we
have the following result relating them to the group $G$ generated by the round
functions:

Theorem 1 ([12]). With notation as above, $G_t$ is a normal subgroup of $G$.
Moreover the group generated by the $t$-round encryptions with round keys from
a particular key-schedule is a subgroup of $G_t$.

Example 1. DES (described in full in [?]) is essentially an iterated block cipher
with $t = 16$ rounds, message space $M = V_{64}$, the vector space of dimension 64
over $\mathbb{Z}_2$, and round keyspace $K = V_{48}$. The form taken by the round functions
$R_k$ of DES is:

$$m R_k = (l, r) R_k = (r,\ l \oplus f(r, k))$$

where $l, r \in V_{32}$ denote the left and right halves of message $m$ and $f : V_{32} \times V_{48} \to
V_{32}$. The group $G$ generated by the round functions of DES is known to be the
alternating group on $V_{64}$, denoted $A_{2^{64}}$ [?]. Since $G$ is simple and $G_{16}$ is normal
in $G$, the group generated by DES with independent round keys is also $A_{2^{64}}$.
The group generated by DES itself (with key-schedule as defined in [?]) is not
known.
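The Feistel form $(l, r)R_k = (r, l \oplus f(r, k))$ is invertible whatever $f$ is, which is why each $R_k$ is a permutation of $M$. A minimal sketch follows; `toy_f` is a hypothetical stand-in and is not the DES $f$ function, and round keys are plain integers here rather than elements of $V_{48}$.

```python
MASK32 = 0xFFFFFFFF


def toy_f(r: int, k: int) -> int:
    """Hypothetical round function on 32-bit words (NOT the DES f function)."""
    return ((r * 0x9E3779B1) ^ k ^ (r >> 7)) & MASK32


def round_fn(block, k, f=toy_f):
    """(l, r) R_k = (r, l XOR f(r, k))."""
    l, r = block
    return (r, l ^ f(r, k))


def inverse_round_fn(block, k, f=toy_f):
    """Inverse of R_k: recover (l, r) from (r, l XOR f(r, k))."""
    r, x = block
    return (x ^ f(r, k), r)


def encrypt(block, round_keys, f=toy_f):
    """Apply R_{k_1}, R_{k_2}, ..., R_{k_t} from left to right."""
    for k in round_keys:
        block = round_fn(block, k, f)
    return block


def decrypt(block, round_keys, f=toy_f):
    """Undo the rounds in reverse order."""
    for k in reversed(round_keys):
        block = inverse_round_fn(block, k, f)
    return block
```

The inverse only ever evaluates $f$ in the forward direction, so no invertibility assumption on $f$ is needed.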

We will follow the exposition of [?, Sections 6 and 7] on imprimitive groups.
Our presentation is necessarily compressed.

Let $G$ be a group of permutations acting on a set $M$ (the reader can imagine
$G$ and $M$ to be as above). A subset $Y$ of $M$ is said to be a block of $G$ if for every
$g \in G$,

$$\text{either } Yg = Y \text{ or } Yg \cap Y = \emptyset .$$

Here $Yg$ denotes the set $\{yg : y \in Y\}$. The sets $M$, $\emptyset$ and the singletons $\{y\}$
are blocks of every $G$ acting on $M$. These are called the trivial blocks. The
intersection of two blocks of $G$ is also a block.

If $Y$ is a block of $G$, then so is $Yg$ for every $g \in G$. The set of distinct blocks
obtained from a block $Y$ in this way is called a complete block system. All blocks
of such a system have the same size and if $G$ is transitive on $M$, then every
element of $M$ lies in a block of the system. Thus, in this case, the blocks form a
partition of $M$ into disjoint sets of equal size.

Suppose now that G is transitive. Then G is said to be imprimitive (or act 
imprimitively) if there is at least one non-trivial block Y . We will then refer to 
a complete non-trivial block system. Otherwise, G is said to be primitive. 

Let $G$ act imprimitively on a finite set $M$ and let $Y$ be a block of $G$, with
$|Y| = s$. Since $G$ is transitive, there exist elements $1 = \tau_1, \tau_2, \ldots, \tau_r \in G$ such
that the sets

$$Y_1 = Y\tau_1 = Y,\ Y_2 = Y\tau_2,\ \ldots,\ Y_r = Y\tau_r$$

form a complete non-trivial block system. Here, $|M| = rs$. Thus, for every $g \in G$,
there exists a permutation $\bar{g}$ of $\{1, 2, \ldots, r\}$ such that

$$Y_i g = Y_{i\bar{g}} .$$

The set of $\bar{g}$ form a permutation group $\bar{G}$ on $\{1, 2, \ldots, r\}$ and the map $g \to \bar{g}$ is
a group homomorphism from $G$ onto $\bar{G}$.
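For a small permutation group given by generators, one can test whether a partition is a complete block system and compute the induced permutation $\bar{g}$. The sketch below assumes permutations are stored as Python lists with `g[x]` the image of `x`; the function names are illustrative. It relies on the fact that a partition mapped to itself by every generator is mapped to itself by the whole (finite) group.

```python
def is_block_system(parts, generators):
    """True if every generator maps each part of the partition onto a part."""
    index = {x: i for i, part in enumerate(parts) for x in part}
    for g in generators:
        for part in parts:
            # Equal-size parts and a bijection: landing in one part means onto.
            if len({index[g[x]] for x in part}) != 1:
                return False
    return True


def induced_permutation(parts, g):
    """The permutation g-bar of block indices, with Y_i g = Y_{g-bar(i)}."""
    index = {x: i for i, part in enumerate(parts) for x in part}
    return [index[g[next(iter(part))]] for part in parts]


# Toy example on M = {0, ..., 7} with four blocks of size 2.
PARTS = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
SHIFT = [(x + 2) % 8 for x in range(8)]      # moves each block to the next one
FLIP = [x ^ 2 for x in range(8)]             # swaps blocks in pairs
SWAP01 = [1, 0, 2, 3, 4, 5, 6, 7]            # acts inside the first block only
```

Here `induced_permutation(PARTS, SHIFT)` is the 4-cycle on block indices, whereas the map $x \mapsto x + 1 \bmod 8$ does not respect `PARTS`.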

3 Attacks Based on Imprimitivity 

Suppose the group $G$ generated by the round functions $R_k : M \to M$ of a $t$-round
cipher acts imprimitively on $M$, and let $Y_1, \ldots, Y_r$ be a complete non-trivial block
system for $G$. Suppose further that, given $m \in M$, there is a description of the
blocks such that it is easy to compute the $i$ with $m \in Y_i$ and that round keys
$k_1, \ldots, k_t$ are in use.

Our basic attack is a chosen-plaintext attack whose success is independent 
of the number t of rounds in use. 

3.1 Basic Attack 

Suppose that we choose one plaintext $m_i$ in each set $Y_i$ and obtain the
corresponding ciphertext $c_i$. Then the effect of $g = R_{k_1} \cdots R_{k_t}$ on the blocks $Y_i$ is
determined. For by the imprimitivity of $G$,

$$c_i = m_i g \in Y_j \ \Rightarrow\ Y_i g = Y_j .$$

Now given any further ciphertext $c$, we compute $l$ such that $c \in Y_l$. Then the
plaintext $m$ corresponding to $c$ satisfies $m \in Y_{l\bar{g}^{-1}}$. Thus $r$ chosen plaintexts
determine that the message corresponding to any ciphertext must lie in a set of
size $s = |M|/r$. Hence the security of the system is severely compromised. The plaintext
$m$ itself can be found by examining the set of meaningful messages in $Y_{l\bar{g}^{-1}}$.

Alternatively, the basic attack determines the permutation $\bar{g}$ of $\bar{G}$ corresponding
to $g$: we can think of $\{1, \ldots, r\}$ as being the message space of a new cipher
(where the encryption of $i$ is $i\bar{g}$ for round keys $k_1, \ldots, k_t$) and regard our basic
attack as simply obtaining all the plaintext/ciphertext pairs for a fixed set of
round keys.
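The basic attack can be sketched on a toy cipher whose round functions preserve a known block system. Everything below is a hypothetical illustration and not the cipher of this paper: the message space is $\{0, \ldots, 255\}$, the blocks are the 16 sets of bytes sharing a high nibble, and `HI_BOX`, `LO_BOX` are arbitrary 4-bit permutations.

```python
# Two arbitrary permutations of {0, ..., 15} (assumptions of this sketch).
HI_BOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]
LO_BOX = [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8]


def toy_round(x: int, k: int) -> int:
    """Round function on bytes; the new high nibble depends only on the old one."""
    hi, lo = x >> 4, x & 0xF
    new_hi = HI_BOX[hi ^ (k & 0xF)]
    new_lo = LO_BOX[lo ^ (k >> 4) ^ hi]
    return (new_hi << 4) | new_lo


def encrypt(x: int, round_keys) -> int:
    for k in round_keys:
        x = toy_round(x, k)
    return x


def block_of(x: int) -> int:
    """Index i of the block Y_i containing x (here: the high nibble)."""
    return x >> 4


def learn_block_permutation(oracle):
    """One chosen plaintext per block determines g-bar: i -> block of c_i."""
    return {i: block_of(oracle(i << 4)) for i in range(16)}


def plaintext_block(c: int, g_bar) -> int:
    """Block containing the plaintext of ciphertext c, from g-bar inverse."""
    inverse = {j: i for i, j in g_bar.items()}
    return inverse[block_of(c)]
```

With 16 chosen plaintexts the attacker narrows every later plaintext down to a block of 16 candidates, independently of the number of rounds.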

3.2 Key-Schedule Dependent Attack 

Every choice of round keys $k_1, \ldots, k_t$ determines a corresponding permutation $\bar{g}$
of $\{1, 2, \ldots, r\}$. It is conceivable that there is an attack on the new cipher more
efficient than exhaustively obtaining all the ciphertexts. Ideally such an attack
would also obtain key information. As an important example, the round keys
may be derived from a session key in such a way that $\bar{g}$ is wholly determined by
only a part of the session key information. In practice, this information might
take the form of the values of certain bits of the session key, or the value of linear
expressions involving session key bits. We can think of $\bar{g}$ as being determined by
keys from a reduced keyspace. Then it may be feasible to carry out an exhaustive
search of the reduced keyspace using only a few known plaintext/ciphertext pairs
to determine a unique reduced key. Given such session key information, it may
then be possible to deduce the complete session key by another exhaustive search.
We have a divide-and-conquer attack on the session key.

This latter attack is then closely related to the attacks of [?] and [?] on
ciphers whose round functions possess linear factors and linear structures
respectively. For example, when $M = V_n$ and the $Y_i$ consist of a linear subspace $U$
of $V_n$ and its cosets, we have a special type of linear factor (as described in [?])
where the plaintext and ciphertext maps are equal and map coset $Y_i = U + a_i$
to $a_i$.

3.3 Multiple Block System Attack 

In an extension of the basic attack, we make use of two or more complete non- 
trivial block systems. 

Example 2. Using the notation of Example 1, we define an $f$ function as follows:
we divide the input $r$ to the $f$ function into two halves $r_1, r_2 \in V_{16}$ and define

$$f(r, k) = (f_1(r_1, k), f_2(r_2, k))$$

where $f_i : V_{16} \times K \to V_{16}$ are arbitrary. It was shown in [?] that the $f_i$ can be
chosen so that the iterated block cipher with round function $(l, r)R_k = (r, l \oplus
f(r, k))$ is secure against linear and differential cryptanalysis. We model an attack
based on two complete systems of imprimitivity: we write elements of $V_{64}$ as
$(x_1, x_2, x_3, x_4)$ where $x_i \in V_{16}$ and define $2^{33}$ sets of size $2^{32}$:

$$Y_{(x_1,x_3)} = \{(x_1, x_2, x_3, x_4) : x_2, x_4 \in V_{16}\}, \quad x_1, x_3 \in V_{16},$$
$$Z_{(x_2,x_4)} = \{(x_1, x_2, x_3, x_4) : x_1, x_3 \in V_{16}\}, \quad x_2, x_4 \in V_{16} .$$

Notice that

$$Y_{(x_1,x_3)} R_k = Y_{(x_3,\ x_1 \oplus f_1(x_3,k))}, \qquad Z_{(x_2,x_4)} R_k = Z_{(x_4,\ x_2 \oplus f_2(x_4,k))}$$

so that the sets $\{Y_{(x_1,x_3)} : x_1, x_3 \in V_{16}\}$ and $\{Z_{(x_2,x_4)} : x_2, x_4 \in V_{16}\}$ form
complete block systems for $G$, the group generated by the $R_k$. Moreover, for any
$x_1, x_2, x_3, x_4$,

$$Y_{(x_1,x_3)} \cap Z_{(x_2,x_4)} = \{(x_1, x_2, x_3, x_4)\} .$$

Suppose we choose the $2^{32}$ plaintexts of the form $(x_1, x_1, x_3, x_3)$ and obtain their
encryptions. From this information we can recover permutations $\bar{g}_1$ and $\bar{g}_2$ of
$V_{16} \times V_{16}$ such that for all $x_1, x_2, x_3, x_4$

$$Y_{(x_1,x_3)}\, g = Y_{(x_1,x_3)\bar{g}_1}, \qquad Z_{(x_2,x_4)}\, g = Z_{(x_2,x_4)\bar{g}_2} .$$

Given any further ciphertext $(c_1, c_2, c_3, c_4)$ with corresponding message $m$ we
have

$$m \in Y_{(c_1,c_3)\bar{g}_1^{-1}} \cap Z_{(c_2,c_4)\bar{g}_2^{-1}},$$

a set of size one. Thus $m$ can be found uniquely.

This attack is applicable to any cipher where the intersections of blocks from 
different systems can be computed and are “small” . 
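A scaled-down version of Example 2 makes the two-system attack concrete. In the sketch below the quarters are 4 bits wide instead of 16, so the cipher acts on 16-bit blocks and $2^8$ chosen plaintexts suffice; `f1`, `f2` and `BOX` are hypothetical choices made only for this illustration.

```python
BOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]  # arbitrary 4-bit table


def f1(x: int, k: int) -> int:
    return BOX[x ^ (k & 0xF)]


def f2(x: int, k: int) -> int:
    return BOX[(x + (k >> 4)) & 0xF] ^ (k & 0xF)


def toy_round(state, k):
    """(x1, x2, x3, x4) R_k = (x3, x4, x1 ^ f1(x3, k), x2 ^ f2(x4, k))."""
    x1, x2, x3, x4 = state
    return (x3, x4, x1 ^ f1(x3, k), x2 ^ f2(x4, k))


def encrypt(state, round_keys):
    for k in round_keys:
        state = toy_round(state, k)
    return state


def learn_tables(oracle):
    """Chosen plaintexts (x1, x1, x3, x3) reveal the inverses of g1-bar and g2-bar."""
    inv1, inv2 = {}, {}
    for x1 in range(16):
        for x3 in range(16):
            c1, c2, c3, c4 = oracle((x1, x1, x3, x3))
            inv1[(c1, c3)] = (x1, x3)   # Y-system: coordinates 1 and 3
            inv2[(c2, c4)] = (x1, x3)   # Z-system: coordinates 2 and 4
    return inv1, inv2


def recover(ciphertext, inv1, inv2):
    """Intersect the Y-block and the Z-block containing the plaintext."""
    c1, c2, c3, c4 = ciphertext
    x1, x3 = inv1[(c1, c3)]
    x2, x4 = inv2[(c2, c4)]
    return (x1, x2, x3, x4)
```

The attack works because coordinates (1, 3) and (2, 4) evolve independently of each other under every round function, so each pair is encrypted by its own small permutation.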

4 A DES-Like Cipher with a Trapdoor 

Given the description of a set of round functions, it appears to be a difficult 
computational problem either to find a non-trivial complete block system for the 
corresponding group G or to disprove the existence of such a system. However 
the attacks above show that an iterated block cipher with an imprimitive group 
G is inherently weak if a complete block system is known. 

It appears then that using a set of round functions which generate an impri- 
mitive group (whose block system is not revealed) may lead to a block cipher 
containing a trapdoor that is difficult to detect. To give a convincing demon- 
stration of this, we should build a set of round functions according to recognised 
principles. The individual components should satisfy relevant design criteria and 
we should also demonstrate the security of our cipher against known attacks. 
This is our objective in this section. We give a full design for such a block ci- 
pher, except for a key-schedule. In the next section we will describe how our 
round functions were designed to generate an imprimitive group and how the 
cipher can be broken. 

4.1 Description of Round Function 

Perhaps the most commonly used template in the design of a block cipher is
the Feistel construction. In turn the most celebrated Feistel-type cipher is DES
itself. With reference to Example 1 and [?], the $f$ function of DES consists of
four components: we write $f(r, k) = PS(E(r) \oplus k)$ where

— the expansion phase, $E$, is a linear map from $V_{32}$ to $V_{48}$,

— $k$ is the 48-bit round key, derived from a 56-bit session key,

— $S$ denotes the operation of the S-boxes: eight carefully selected 6-bit to 4-bit
functions, numbered 1, …, 8, operating in parallel on $V_{48}$,

— $P$ is a carefully selected bit permutation of $V_{32}$.

Our proposed block cipher consists of 32 repetitions of DES-like round functions:

$$(l, r)R_k = (r,\ l \oplus PS(E(r) \oplus k)) .$$

Here $E$ and $P$ are as in the original DES, but the S-boxes are replaced by
the boxes presented in the appendix. Our round keys $k$ are also 48 bits and are
derived from an 80-bit session key according to a key-scheduling algorithm which
we leave unspecified. Any suitably strong schedule could be used (for example,
we could expand the original DES schedule).

We note that the selection of S-boxes is critical to the security of DES.
Numerous attacks have been made on versions of DES with modified S-boxes:
see for example the early critique of DES in [?], the differential attacks on DES
with modified S-boxes in [?] and the attack of [?] on the proposals of [?].

Each S-box in the appendix has the following properties, similar to those
given in [?] for the DES S-boxes:

S1 Each S-box has six bits of input, four bits of output.

S2 The best linear approximation of an S-box (in the sense of equation
(3)) holds with probability $p$ over all inputs, where $|p - \frac{1}{2}| \le$ […].

S3 Fixing the bits input to an S-box on the extreme left and on the extreme
right at any two values, the resulting map from $V_4$ to $V_4$ is a permutation.

S4 If two inputs $i, i'$ to an S-box differ in the pattern 000100 or 001000 (i.e.
$i \oplus i' = 000100$ or $001000$), then the corresponding outputs differ in at least
one position.

S5 If two inputs to an S-box differ in the pattern 001100, then the corresponding
outputs differ in at least one position.

S6 If two inputs $i, i'$ satisfy $i \oplus i' = 11xy00$, where $x$ and $y$ are arbitrary bits,
then the corresponding outputs differ in at least one position.

S7 For any non-zero input difference $i \oplus i'$ not equal to one of those specified
in S4, S5, the number of ordered pairs $i, i'$ leading to a given non-zero output
difference is at most 16. For the input differences in S4 and S5, the
corresponding maximum is 24.

S8 For any non-zero input difference $i \oplus i'$, the number of ordered pairs $i, i'$
leading to an output difference of zero is at most 12.

S2 guarantees that the S-boxes are not too linear, while S3 ensures they are 
balanced. S4-S6 can be regarded as weak avalanche criteria. Thus our S-boxes 
automatically have some desirable features. 
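Properties such as S3, S7 and S8 can be tested mechanically from the table of an S-box. The appendix S-boxes are not reproduced in this section, so the sketch below uses DES S-box S1 as a stand-in; the bit convention (row chosen by the two extreme bits, column by the four middle bits) is the usual DES one, and the function names are illustrative.

```python
# DES S-box S1, used here only as a stand-in for the appendix S-boxes.
DES_S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox(x: int) -> int:
    """6-bit input, 4-bit output; extreme bits pick the row, middle bits the column."""
    row = ((x >> 5) << 1) | (x & 1)
    col = (x >> 1) & 0xF
    return DES_S1[row][col]


def difference_table(s):
    """table[dx][dy] = number of ordered pairs (i, i') with i ^ i' = dx, s(i) ^ s(i') = dy."""
    table = [[0] * 16 for _ in range(64)]
    for x in range(64):
        for dx in range(64):
            table[dx][s(x) ^ s(x ^ dx)] += 1
    return table


def satisfies_s3(s) -> bool:
    """S3: with the two extreme input bits fixed, the map V_4 -> V_4 is a permutation."""
    for left in range(2):
        for right in range(2):
            outputs = {s((left << 5) | (mid << 1) | right) for mid in range(16)}
            if len(outputs) != 16:
                return False
    return True


def max_zero_output_count(table) -> int:
    """Largest count in the zero-output-difference column over non-zero dx (cf. S8)."""
    return max(table[dx][0] for dx in range(1, 64))
```

An S-box satisfying S3 automatically has `table[dx][0] == 0` whenever `dx` is non-zero with both extreme bits equal to zero, which covers the input differences named in S4 and S5.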

We also draw to the reader’s attention the properties P1 to P3 of the $P$
permutation noted in [?]. From left to right, we label the input bits to our S-boxes
$p_1, p_2, p_3, p_4, p_5, p_6$ and the output bits $q_1, q_2, q_3, q_4$. We refer to bits $p_3$
and $p_4$ as centre bits and bits $p_1, p_2, p_5, p_6$ as outer bits.

P1 The four bits output from each S-box are distributed so that two of them
affect centre bits, and the other two affect outer bits of S-boxes in the next
round.

P2 The four bits output from each S-box affect six different S-boxes in the next
round, no two affect the same S-box.

P3 For two S-boxes $j, k$, if an output bit from S-box $j$ affects a centre bit of
S-box $k$, then an output bit from S-box $k$ cannot affect a centre bit of S-box $j$.



4.2 Security against Linear and Differential Attacks 

Here we estimate the resistance of our example to linear [ZH and differential m 
01 cryptanalysis. 

We begin by estimating the complexity of a linear attack. By property S2 
and Lemma 3 of m, the best linear expression that is built up round-by-round 
and involves input bits to round 2, output bits from round 31, key bits and a 
linear approximation in every round will hold with approximate probability 
where 

While a more delicate analysis may find linear characteristics not involving linear 
approximations in every round, it seems unlikely that these will have probability 
larger than the above bound on (since this bound is calculated using the 
highest per-round probability). We make the rough assumption that a linear 
attack using Algorithm 2 of fm would require at least 2®^ known plaintexts. 

The success of a basic differential attack depends on finding a high probability
characteristic: a $t$-round characteristic having probability $p$ is a sequence of
differences

$$\Delta m_1, \Delta m_2, \ldots, \Delta m_{t-1}, \Delta m_t$$

such that if $\Delta m_1$ is the difference in plaintexts $m \oplus m'$ input to the first round,
then the differences propagated to the inputs of subsequent rounds are
$\Delta m_2, \ldots, \Delta m_t$ with probability $p$, assuming independent round keys. In practice,
at least a 29 round characteristic is needed to attack a 32 round iterated cipher.
The number of plaintext input pairs required in a successful attack based on
such a characteristic having probability $p$ is at least $1/p$. Of particular importance
are iterative characteristics where the output difference at the last round is equal
to the initial input difference; such a characteristic can be concatenated with
itself many times to form a longer characteristic. To provide practical security
against a differential attack, we need to bound the probability of short iterative
characteristics. For further details, see [?].

We say that an S-box $j$ is active in round $i$ of a characteristic if $\Delta m_i$ involves
a non-zero input difference to S-box $j$. We can use properties S3 to S6, P2 and
P3 and arguments similar to those of [?] to show the following for our cipher:

Lemma 1. If round $i$ of a characteristic consists of two adjacent active S-boxes
$j, j+1$ then either round $i-1$ or round $i+1$ (or both) has at least one active S-box.
If round $i$ of a characteristic has only one active S-box $j$, then either round
$i-1$ or round $i+1$ (or both) has at least one active S-box.

A 29 round characteristic having no rounds without active S-boxes must involve 
a total of at least 29 active S-boxes. Using S7 and assuming independence, we can 
bound the probability of such a pattern by p < (||) = 2“^^. We have found 

characteristics with probability close to this, but omit the details. An attractive 
pattern of differences (used in 0 to attack DES) involves active S-boxes on even 
numbered rounds and no active S-boxes on odd numbered rounds. From the 
above lemma, the active rounds must involve at least a pattern of 3 adjacent 
S-boxes. By property S8, we can bound the probability of a 29 round pattern of 
this type by (g|) = 2“^°^. One further pattern of differences that we consider 

involves no active S-boxes on every third round. Using P3 and the lemma above, 
we can show that such a characteristic must involve 3 or more active S-boxes on 
the two active rounds. The probability of such a characteristic over 29 rounds 
is, using S7, at most 2“'^^. The analysis can be carried further, but it suffices 
to say that our cipher possesses a reasonable degree of resistance to differential 
cryptanalysis in its basic form. We note however that our cipher is probably
susceptible to more sophisticated attacks based on truncated [?] or impossible
[?] differentials.

5 Trapdoor Design 

Each S-box in the appendix has the following property: 

By property P1, the combination of $P$ followed by $E$ moves two of the
four outputs of the S-box (say $q_i$ and $q_j$) so as to affect centre bits of
S-boxes in the next round. These two outputs are dependent on every
input bit, while the other two outputs depend only on the outer bits
$p_1, p_2, p_5, p_6$ input to the S-box.

For example, $P$ moves output bit $q_3$ of S-box 1 to position 23 in the output
of the $f$ function. After XORing with the left half and swopping, this position
affects a centre bit, $p_4$, of S-box 6 in the next round. Thus $q_3$ depends on all six
input bits to S-box 1.
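The bookkeeping in this example can be reproduced from the standard DES $P$ table. The sketch below is an illustration only: it assumes the usual convention that output position $p$ of $P$ carries input bit `DES_P[p - 1]`, and uses the fact that, after the $E$ expansion, positions congruent to 2 or 3 modulo 4 feed the centre bits $p_3, p_4$ of an S-box. The helper names are hypothetical.

```python
# Standard DES P permutation: output position p (1-based) takes input bit DES_P[p - 1].
DES_P = [16, 7, 20, 21, 29, 12, 28, 17, 1, 15, 23, 26, 5, 18, 31, 10,
         2, 8, 24, 14, 32, 27, 3, 9, 19, 13, 30, 6, 22, 11, 4, 25]


def destination(sbox: int, q: int) -> int:
    """Position (1..32) in the f output reached by output bit q (1..4) of S-box sbox (1..8)."""
    source = 4 * (sbox - 1) + q
    return DES_P.index(source) + 1


def is_centre_position(pos: int) -> bool:
    """Positions 2, 3, 6, 7, ..., 30, 31 become centre bits p3, p4 after E."""
    return pos % 4 in (2, 3)


def centre_target(pos: int) -> tuple[int, int]:
    """(S-box number, input bit index 3 or 4) fed by a centre position."""
    assert is_centre_position(pos)
    return ((pos - 1) // 4 + 1, 3 if pos % 4 == 2 else 4)


def centre_outputs(sbox: int) -> list[int]:
    """Output bits q of an S-box that P sends to centre positions (property P1)."""
    return [q for q in range(1, 5) if is_centre_position(destination(sbox, q))]
```

In particular `destination(1, 3)` is 23 and `centre_target(23)` is `(6, 4)`, matching the example in the text, and each S-box has exactly two outputs routed to centre bits.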

From the property above, it follows that the output bits of the $f$ function in
positions 1, 4, 5, 8, …, 29, 32 depend only on round key bits and the $f$ function
inputs in the same positions, 1, 4, 5, 8, …, 29, 32 (these being the $f$ function input
bits which after $E$ and key XOR become outer bits of S-boxes). We therefore
have:

Lemma 2. Label the $2^{16}$ distinct additive cosets of the 16 dimensional subspace

$$U = \{(0, x_2, x_3, 0, 0, x_6, x_7, 0, \ldots, 0, x_{30}, x_{31}, 0) : x_i \in \mathbb{Z}_2\}$$

of $V_{32}$ by $U \oplus a_1, \ldots, U \oplus a_{2^{16}}$. Then for every $j$ and every round key $k$, there
exists an $l$ such that $PS(E(U \oplus a_j) \oplus k) \subseteq U \oplus a_l$.

Notice that for any subset $W$ of subspace $U$, we have $U \oplus W = U$, so

$$(U \oplus a_i) \oplus PS(E(U \oplus a_j) \oplus k) = U \oplus a_i \oplus a_l = U \oplus a_m$$

for some $m$. Therefore $(U \oplus a_i, U \oplus a_j)R_k = (U \oplus a_j, U \oplus a_m)$. It is easy to see
that the $R_k$ act transitively on $V_{64}$ and we have

Lemma 3. The $2^{32}$ subsets $(U \oplus a_i, U \oplus a_j)$ of $V_{64}$ form a complete non-trivial
block system for $G$, the group generated by the round functions of our cipher.
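The coset property of Lemma 2 can be observed experimentally with randomly generated S-boxes that have the trapdoor structure. The sketch below is not the cipher of the appendix: it uses the standard DES $E$ and $P$ tables, but the S-boxes are random tables (they do not satisfy S1 to S8) in which every output bit routed by $P$ to an outer position reads only the outer input bits. All names are illustrative.

```python
import random

DES_E = [32, 1, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10, 11, 12, 13,
         12, 13, 14, 15, 16, 17, 16, 17, 18, 19, 20, 21, 20, 21, 22, 23, 24, 25,
         24, 25, 26, 27, 28, 29, 28, 29, 30, 31, 32, 1]
DES_P = [16, 7, 20, 21, 29, 12, 28, 17, 1, 15, 23, 26, 5, 18, 31, 10,
         2, 8, 24, 14, 32, 27, 3, 9, 19, 13, 30, 6, 22, 11, 4, 25]
OUTER = [p for p in range(1, 33) if p % 4 in (0, 1)]   # 1, 4, 5, 8, ..., 29, 32


def make_trapdoor_sboxes(rng):
    """Random tables: per S-box, 4 tables on all 6 bits and 4 on the outer bits only."""
    return [([[rng.randrange(2) for _ in range(64)] for _ in range(4)],
             [[rng.randrange(2) for _ in range(16)] for _ in range(4)])
            for _ in range(8)]


def f(r, k, boxes):
    """f(r, k) = P(S(E(r) xor k)); r is a list of 32 bits, k a list of 48 bits."""
    x = [r[e - 1] ^ kb for e, kb in zip(DES_E, k)]
    s_out = []
    for j in range(8):
        p = x[6 * j: 6 * j + 6]
        all_bits = sum(bit << (5 - n) for n, bit in enumerate(p))
        outer_bits = (p[0] << 3) | (p[1] << 2) | (p[4] << 1) | p[5]
        full, outer_only = boxes[j]
        for q in range(4):
            dest = DES_P.index(4 * j + q + 1) + 1
            if dest % 4 in (0, 1):
                # This output lands on an outer position: it sees outer bits only.
                s_out.append(outer_only[q][outer_bits])
            else:
                s_out.append(full[q][all_bits])
    return [s_out[src - 1] for src in DES_P]


def same_coset(a, b) -> bool:
    """Two words lie in the same coset of U iff they agree on the outer positions."""
    return all(a[p - 1] == b[p - 1] for p in OUTER)
```

If two inputs `r` and `r2` satisfy `same_coset(r, r2)`, then `same_coset(f(r, k, boxes), f(r2, k, boxes))` holds for every round key `k`, which is the content of Lemma 2 for this toy instance.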

The round functions of our cipher generate an imprimitive group where the
blocks of a complete system are easily identified. Thus our cipher is susceptible
to the basic attack described in Section 3 with $2^{32}$ chosen plaintexts. Suppose
further that a key-schedule is chosen such that over the 32 rounds, only 40 bits
of the 80-bit session key are involved in XORs with outputs of the $E$ expansion
which become outer bits of the S-boxes. Then, in the terminology of Section 3, the
permutation $\bar{g}$ is determined by only half of the session key bits and an exhaustive
attack on those bits can be successfully carried out with knowledge of a handful
of plaintext/ciphertext pairs. The remaining 40 bits of session key can then also
be found by exhaustive attack, the total complexity of the attack being around
$2^{41}$ trial encryptions, well within the bounds of practicality. Notice that this
attack depends crucially on the interaction between the system of imprimitivity
and the key-schedule.

6 Discussion and Conclusions 

We have considered attacks based on a property of a group associated with an 
iterated block cipher. The attacks motivate a new design criterion for iterated 
block ciphers: the group generated by the round functions should be primitive. 
Unfortunately this property seems to be hard to verify in practice. We note that 
DES and IDEA (probably, see [?]) do satisfy this property. 

We have given an example of a cipher secure in some conventional senses 
but weak because of a deliberately inserted trapdoor. There are however some 
immediate criticisms that can be made of our example. Firstly, the S-boxes are 
incomplete (that is, not every output bit of the S-boxes depends on every input 
bit). This goes against a generally accepted design principle for S-boxes [?] 
and would arouse suspicion. A close examination of the S-boxes and their 
interaction with the P permutation would then reveal our trapdoor. Incompleteness 
in the S-boxes also leads to a block cipher where half of the ciphertext 
bits are independent of half of the plaintext bits. Thus our trapdoor is not so 
well hidden. Secondly and less seriously, our cipher's resistance to differential 
attacks is not as high as one might expect from a 32 round system. 

Suppose however that our example cipher is not made public (for example, 
by using tamper-resistant hardware). We are then given a 64-bit iterated block 
cipher with 32 rounds and an 80-bit key and could be truthfully told by a 
panel of experts that it is secure against linear and differential attacks. The 
incompleteness noted above can be hidden by applying a suitable invertible 
output transformation to the ciphertexts. Because of the size of the message 
space and choice of output transformation, we would then be unlikely to be 
able to detect any block structure just by examining plaintext/ciphertext pairs. 
Yet our example cipher contains a trapdoor rendering the system completely 
insecure to anyone with knowledge of the trapdoor. Clearly in this situation, we 
must have complete faith in the purveyor of the block cipher. 

We conclude by suggesting some avenues for further research. 

The choice of trapdoor in our example was forced upon us by a combination 
of the E expansion, the round key XORing and the bitwise nature of the P 
permutation. Can “undetectable” trapdoors based on more complex systems of 
imprimitivity be inserted in otherwise conventional ciphers? It is easily shown 
that, in a DES-like cipher, any system based on a linear sub-space and its cosets 
leads to a noticeable regularity in the XOR tables of small S-boxes. It seems 
that we must look beyond the “linear” systems considered here, or consider 
other types of round function. 

Our attention has been directed to block systems preserved by the group G, 
that is, on a per-round basis. It might also be interesting to look at the case where 
the round functions generate a primitive group, but the subgroup generated 
by the full multi-round cipher itself has a block structure. Attacks exploiting a block 
structure holding probabilistically may also be powerful and worth examining. 
In this respect the thesis [?] is particularly relevant. 
Appendix 

We present the S-boxes of our example block cipher in the same format as 
the DES S-boxes were presented in [?], that is, each box is written as four rows 
of permutations: 



S-box 1 

8 0 10 1 9 3 11 2 4 12 7 14 6 15 5 13 

9 5 10 7 8 4 11 6 14 1 13 0 12 2 15 3 

14 10 15 11 12 9 13 8 1 5 2 7 0 4 3 6 

11 5 9 4 8 6 10 7 1 14 0 12 3 15 2 13 

S-box 2 

1 15 0 12 3 13 2 14 6 9 5 8 4 10 7 11 

11 1 10 2 8 0 9 3 6 15 7 13 5 12 4 14 

1 14 3 12 0 15 2 13 8 6 10 4 9 5 11 7 

2 5 1 7 0 6 3 4 15 8 14 9 13 10 12 11 



S-box 3 

15 11 13 9 12 10 14 8 3 4 1 6 0 7 2 5 

0 14 1 12 2 15 3 13 10 6 8 5 11 7 9 4 

14 1 13 2 15 0 12 3 8 7 11 6 10 5 9 4 

4 12 7 13 6 14 5 15 11 3 8 2 9 0 10 1 

S-box 4 

12 3 6 1 4 11 14 9 7 2 15 10 5 0 13 8 

5 3 15 11 7 9 13 1 6 10 14 8 12 0 4 2 

4 9 14 11 12 1 6 3 2 7 0 15 10 13 8 5 

15 4 5 12 13 14 7 6 9 10 11 8 1 0 3 2 



S-box 5 

1 6 4 7 0 2 5 3 13 10 8 14 9 15 12 11 

2 4 7 0 6 5 3 1 9 14 13 15 8 10 12 11 

0 13 4 9 1 12 5 8 7 15 6 10 2 11 3 14 

11 2 15 7 14 3 10 6 1 13 4 12 5 8 0 9 
S-box 6 

8 5 11 4 9 6 10 7 1 14 3 15 2 13 0 12 

7 3 6 0 4 1 5 2 9 14 11 13 8 12 10 15 

7 8 6 10 5 9 4 11 3 15 0 14 2 12 1 13 

12 6 15 7 14 4 13 5 2 11 1 9 0 8 3 10 

S-box 7 

12 3 15 1 14 2 13 0 11 5 10 7 8 6 9 4 

12 6 13 5 14 4 15 7 0 9 3 10 1 8 2 11 

1 12 3 14 2 13 0 15 9 7 8 4 11 6 10 5 

11 14 9 15 8 13 10 12 4 1 7 3 5 2 6 0 



S-box 8 

12 5 10 7 8 3 14 1 6 11 0 9 4 15 2 13 

11 12 13 8 9 10 15 14 2 3 0 1 6 5 4 7 

3 8 7 12 5 10 1 14 0 13 6 15 2 9 4 11 

5 13 3 9 1 11 7 15 10 0 8 6 12 4 14 2 
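
The incompleteness of the S-boxes discussed in Section 6 can be checked directly from the tables above. The following Python sketch does this for S-box 3. It assumes the usual DES indexing convention (the row is selected by the outer input bits $p_1 p_6$, the column by $p_2 p_3 p_4 p_5$, and $q_1$ is the most significant output bit); this convention is an assumption based on the remark that the tables are given in the DES format, and the helper names are illustrative.

```python
# S-box 3 as printed in this appendix (four rows of sixteen entries).
SBOX3 = [
    [15, 11, 13, 9, 12, 10, 14, 8, 3, 4, 1, 6, 0, 7, 2, 5],
    [0, 14, 1, 12, 2, 15, 3, 13, 10, 6, 8, 5, 11, 7, 9, 4],
    [14, 1, 13, 2, 15, 0, 12, 3, 8, 7, 11, 6, 10, 5, 9, 4],
    [4, 12, 7, 13, 6, 14, 5, 15, 11, 3, 8, 2, 9, 0, 10, 1],
]

def sbox_lookup(sbox, x):
    """Assumed DES convention: x = p1..p6 with p1 most significant,
    row = p1 p6, column = p2 p3 p4 p5."""
    row = (((x >> 5) & 1) << 1) | (x & 1)
    col = (x >> 1) & 0xF
    return sbox[row][col]

def dependencies(sbox):
    """For each output bit q1..q4, collect the input bits p1..p6 that can flip it."""
    deps = []
    for out_bit in range(4):              # 0 -> q1 (most significant)
        shift = 3 - out_bit
        influencing = set()
        for in_bit in range(6):           # 0 -> p1 (most significant)
            mask = 1 << (5 - in_bit)
            if any(
                ((sbox_lookup(sbox, x) ^ sbox_lookup(sbox, x ^ mask)) >> shift) & 1
                for x in range(64)
            ):
                influencing.add(in_bit + 1)
        deps.append(influencing)
    return deps

deps = dependencies(SBOX3)
for q, d in enumerate(deps, start=1):
    print(f"q{q} depends on input bits {sorted(d)}")
```

Under the stated convention, two of the four output bits of S-box 3 are unaffected by the centre bits $p_3$ and $p_4$, which is the structural weakness the trapdoor relies on.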
Abstract. The DES has reached the end of its lifetime due to its too 
short key length and block length (56 and 64 bits respectively). As we are 
awaiting the new AES, triple (and double) encryption are the common 
solution. However, several authors have shown that these multiple modes 
are much less secure than anticipated. The general belief is that these 
schemes should not be used, as they are not resistant against attacks 
requiring chosen plaintexts. This paper extends the analysis by con- 
sidering some more realistic attack models. It also presents an improved 
attack on multiple modes that contain an OFB mode and discusses prac- 
tical solutions that take into account realistic constraints. 



1 Introduction 

Ever since the Data Encryption Standard [?] was adopted in the mid 1970s, 
the issue of its small key size has been raised. Nowadays a 56-bit key is clearly 
within the range of a dedicated exhaustive search machine. Already in 
1979, Tuchman proposed the use of triple-DES with two or three keys [?]. 
Double encryption was rejected quickly because Merkle and Hellman showed 
that a meet-in-the-middle attack requires ‘only’ $2^{57}$ encryptions and a memory with 
$2^{56}$ 112-bit values [23]. Later van Oorschot and Wiener came up with a more 
practical version of this attack, which requires $2^{72}$ encryptions but only 16 Gbyte 
(other trade-offs are available). In the 1980s, triple-DES became popular; 
for example double length master keys were used to encrypt single length DES 
session keys. The best known attack on 2-key triple-DES is also by van Oorschot 
and Wiener [?]; it requires $2^{120-t}$ encryptions and $2^t$ known plaintexts. This 
shows that 2-key triple-DES may provide increased strength against brute force 
key search. 

For encryption of more than one block, a mode of operation has to be defined 
different from the ECB (Electronic CodeBook) mode. The ECB mode is 
vulnerable to a dictionary attack, where an opponent collects ciphertexts and 
corresponding plaintexts. The three other standard modes of operation are defined 
in FIPS 81 [?]: CBC (Cipher Block Chaining), CFB (Cipher FeedBack) 
and OFB (Output FeedBack). The limitation of the CBC and CFB modes is the 
matching ciphertext attack: after encrypting $2^{32}$ blocks, information starts to 
leak about the plaintext (see for example [?]). In the OFB mode, less information 
leaks, but the fact that the key stream has an expected period of $2^{63}$ blocks 
also provides some information on the plaintext. For a formal treatment of the 
modes of operation, see Bellare et al. [2]. 

In the early 1990s, modes for multiple encryption were analysed. The most 
straightforward solution is to replace DES by two-key triple-DES and to use this 
new block cipher in a ‘standard’ mode (known as ‘outer-CBC’ and ‘outer-CFB’ 
[?]). While for CFB and CBC mode this precludes exhaustive key search, the 
complexity of a matching ciphertext attack is still $2^{32}$ blocks, as this depends 
on the block length only. This motivated research on interleaved or combined 
modes, where the modes themselves are considered as primitives. Coppersmith 
analysed some early proposals for two-key triple-DES modes in [?]. The most 
straightforward solution is to iterate the CBC or CFB mode of a block cipher 
(known as ‘inner-CBC’ and ‘inner-CFB’). However, Biham showed that these 
simple interleaved modes are vulnerable to a 2®"* chosen ciphertext attack |4l5j . 

In [?], Biham systematically analyses all the double and triple ‘interleaved’ 
modes, where each layer consists of ECB, OFB, CBC, CFB and the inverses 
of CBC and CFB, denoted with CBC^{-1} and CFB^{-1} respectively. Note that 
there are 36 double encryption schemes and 216 triple encryption schemes. His 
main conclusion is that “all triple modes of operation are theoretically not much 
more secure than a single encryption.” The most secure schemes in this class 
require 2®^ chosen plaintexts or ciphertexts, 2^® encryptions, and 2®® storage 
(for example, scheme 208 in |^). 

Biham also proposes a small set of triple modes, where a single key stream is 
generated in OFB mode and exored before every encryption and after the last 
encryption [?]. The conjectured security is $2^{112}$ encryptions. He also proposes 
several quadruple modes with conjectured security level $2^{128}$ encryptions. However, 
at FSE’98 Wagner shows that if the attack model is changed to allow for 
chosen ciphertext/chosen IV attacks, the security of all but two of these modes 
can be reduced to $2^{56}$ encryptions and between 2 and $2^{32}$ chosen-IV texts [?]. 

Coppersmith et al. propose the CBCM mode [?], which is a quadruple mode; 
this mode has been included in ANSI X9.52 [?]. However, Biham and Knudsen 
present an attack requiring $2^{65}$ chosen ciphertexts and memory that requires $2^{58}$ 
encryptions [?]. 

Many of these attacks are very intricate, but one cannot escape the con- 
clusion that these are only ‘certificational’ attacks. In most environments, it is 
completely unthinkable to carry out a chosen plaintext or ciphertext attack with 
more than 2^® texts (e.g., on a smart card). Moreover, attacks that require a 
storage of 2®® 64-bit quantities are not feasible today. This does not imply that 
we do not recommend a conservative design. Our goal is to explore which schemes 
achieve a realistic security level today. For long term security, migration to 
AES (Advanced Encryption Standard) will provide a solution. 



Our contribution. The goal of this paper is to develop a better understanding 
of the security of the simpler structures such as 2-key triple and double modes of 
operation. We show that for common applications where a known IV attack can 
be applied, these modes are alarmingly close to being in the range of exhaustive 
search or at least susceptible to Merkle-Hellman’s meet-in-the-middle attack [?]. 
We study double encryption schemes under different attack models (one of the 
two IVs known, and replay of IV). We also present a new attack on certain 
double modes (the cycle attack), that reduces the plaintext requirement from 
2®^ chosen plaintexts to about 2^® known plaintexts and memory, at the cost of 
an increased work factor for the analysis (2®^ compared to 2®®); nevertheless we 
believe that this may be more realistic. Finally we compare some solutions for 
the cases where the integrity and/or secrecy of the IVs is protected. Depending 
on the setting, one of the following three modes is recommended: double OFB, 
CBC followed by CBC^{-1}, or the latter double mode masked with an OFB stream 
before each encryption and after the last. 

The rest of the paper is organised as follows: the next section discusses the 
notation and the attack models for the IVs. Section 3 gives details on modes that 
can be broken by exhaustive search. Section 4 deals with modes that fall under 
the standard meet-in-the-middle attack (MITM) and Sect. 5 with modes that 
succumb to “narrow pipe” (the term “narrow pipe attack” is due to John Kelsey) 
or collision attacks. These three attacks are becoming more or less practical today 
because of the very low number of texts they require. In Sect. 6, we explain 
our new cycle attack and in Sect. 7 we compare several modes that provide a 
reasonable security level for current applications. Section 8 presents conclusions 
and open problems. 

2 The Setting 

In this section we introduce our notation and discuss the attack model in terms 
of control of the opponent over the IV . 

2.1 Notation 

We refer to Wagner’s paper [?] for notation throughout this paper. The successive 
blocks of plaintext and ciphertext in every multiple mode are denoted 
by $P_0, P_1, P_2, \ldots$ and $C_0, C_1, C_2, \ldots$. The standard single modes (ECB, CBC, 
CFB, OFB, CBC^{-1}, and CFB^{-1}) are combined to double or two-key triple modes 
using the notation X/Y and X/Y/Z respectively, where X, Y, Z are one of the 
above modes. As usual, we assume that the underlying block cipher is “ideal” in 
the sense that the modes are attacked by generic methods, and not by differential [?] 
or linear cryptanalysis [?] for instance. We will be dealing exclusively 
with two keys $K_1$ and $K_2$. For two-key triple modes, $K_1$ is the key of the first 
and the last encryption components, and $K_2$ is the key of the middle decryption 
component. $IV_1$ and $IV_2$ are the initial values of the feedback and chaining 
modes, and for two-key triple encryption an additional $IV_3$ is required. Figure 1 
contains an example of a 2-key triple mode. 



Fig. 1. The CBC/CFB/OFB mode 




2.2 Models for the Initial Value 

We would like to stress that Biham’s attacks in [?] usually consider the initial 
values IV to be unknown, except for some of the modes that are very hard to 
cryptanalyse otherwise. This is the main reason why many attacks require a 
huge number plaintexts or ciphertexts (typically about 2®®). On the other hand, 
Wagner chose to use a security model in which the IV’s may be chosen by the 
attacker. He mentions that his attacks may be converted into known IV attacks 
using slightly more /F’s (about 2^^). One can also consider for certificational 
purposes the more artificial scenario where only one of the IV's is known. 

We believe that for most applications known IV attacks are quite reasonable 
in the case of encryption as the IVs are chosen by the encryption box but have 
to be transmitted with the ciphertext in order to be decrypted by the other 
party. In several practical protocols the IVs are transmitted in the clear. We 
may also want to allow a kind of “chosen” IV attack (in a chosen ciphertext 
setting) in which the adversary does not know the actual value of the IV but 
is able to replay the same (possibly encrypted) IV a few times with different 
text queries. The result of our analysis is that under such threat models, basic 
double or triple modes are deeply flawed. 

In Sect. 7 we will also recommend schemes for scenarios where the IVs are 
encrypted and/or where the integrity of the IVs can be protected. 

3 Divide and Conquer Strategies 

In [?], Biham analyses all 36 double modes (schemes 7 to 42) under the assumption 
that the IVs are unknown. We are interested in a stronger attack model, 
and would like to find out which schemes still have a ‘reasonable’ security level 
against practical attacks. Therefore we analyse double modes for which the best 
known attack (with unknown /K’s) requires more than 2®^ chosen texts. Biham 
lists 15 such modes. 

We consider all of these modes under several known IV attacks and show 
that with a few known texts, their security drops down to the basic exhaustive 
search complexity of a 56-bit key. 

3.1 Known $IV_1$ and $IV_2$ Attacks 

Six modes are vulnerable to direct exhaustive search on each key, requiring only a 
handful of plaintext/ciphertext pairs, about $2^{57}$ encryptions and no memory. These 
modes are: OFB/ECB, ECB/OFB, CBC^{-1}/OFB, OFB/CBC, OFB/CFB, and 
CFB^{-1}/OFB. Note that there are three different modes and their inverses. There 
is no IV on an ECB mode: we will denote this as IV = 0. As an example we 
show how to recover the two keys of the CBC^{-1}/OFB mode depicted in Fig. 2. 

Fig. 2. The CBC^{-1}/OFB mode 



The attack proceeds as follows. Choose a 3 block plaintext of the form 
$(M, M, M)$ and get the corresponding ciphertext $(C_0, C_1, C_2)$ as well as $IV_2$. Then 
from the structure of the mode it follows that 

$C_1 \oplus C_2 = E^2_{K_2}(IV_2) \oplus E^3_{K_2}(IV_2)$, 

where $E^i_{K_2}$ denotes $i$ successive encryptions under $K_2$. 
Therefore we can exhaustively search for the key $K_2$ satisfying this relation. 
Once this key has been found, it is straightforward to recover $K_1$. If more than 
one key pair is found, a few additional plaintext/ciphertext pairs suffice to pick 
the right pair. 
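
The divide and conquer attack on CBC^{-1}/OFB can be sketched in Python at toy scale. Everything concrete here is an assumption made for illustration: the 16-bit Feistel permutation `E` stands in for DES, keys are 10 bits instead of 56, and the mode is modelled as $X_i = E_{K_1}(P_i) \oplus P_{i-1}$ (with $P_{-1} = IV_1$) followed by an OFB layer under $K_2$. The sketch searches for $K_2$ with the relation on $C_1 \oplus C_2$ and then for $K_1$ from the second block.

```python
KEY_BITS = 10  # toy key size (DES would use 56)

def E(key, block):
    """Toy 4-round Feistel permutation on 16-bit blocks; not a real cipher."""
    left, right = block >> 8, block & 0xFF
    for rnd in range(4):
        f = (((right + key + 37 * rnd) * 181) >> 3) & 0xFF
        left, right = right, left ^ f
    return (left << 8) | right

def encrypt_cbcinv_ofb(k1, k2, iv1, iv2, plaintext):
    """X_i = E_K1(P_i) xor P_{i-1}; C_i = X_i xor S_i with S_0 = E_K2(IV2)."""
    out, prev, stream = [], iv1, iv2
    for p in plaintext:
        x = E(k1, p) ^ prev
        prev = p
        stream = E(k2, stream)      # next OFB key stream block
        out.append(x ^ stream)
    return out

def recover_keys(m, iv2, c):
    """Find K2 from C1 xor C2, then K1 from the second ciphertext block."""
    candidates = []
    for k2 in range(1 << KEY_BITS):
        s0 = E(k2, iv2)
        s1 = E(k2, s0)
        s2 = E(k2, s1)
        if s1 ^ s2 != c[1] ^ c[2]:
            continue
        target = c[1] ^ s1 ^ m      # equals E_K1(M) when k2 is correct
        for k1 in range(1 << KEY_BITS):
            if E(k1, m) == target:
                candidates.append((k1, k2))
    return candidates

secret = (0x2A7, 0x13C)
iv1, iv2, m = 0x1234, 0xBEEF, 0x5A5A
c = encrypt_cbcinv_ofb(*secret, iv1, iv2, [m, m, m])
candidates = recover_keys(m, iv2, c)
print(secret in candidates, len(candidates))
```

With such small parameters a few false candidates may survive; as in the text, additional plaintext/ciphertext pairs would be used to single out the right pair.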

Another example is the CFB^{-1}/OFB mode. Choose a message of the form 
$(P_0, P_1)$ and encrypt it twice. Get the corresponding ciphertexts $(C_0, C_1)$ and 
$(C_0^*, C_1^*)$. The IVs will of course be different but the plaintext remains the 
same. Again, the relation between the two second ciphertext blocks is of the 
form: $C_1 \oplus C_1^* = E^2_{K_2}(IV_2) \oplus E^2_{K_2}(IV_2^*)$. Therefore we can first exhaustively 
search for $K_2$ and then for $K_1$. 

Attacking the inverse modes mostly requires a chosen ciphertext attack where 
the IVs may either be chosen or just known to the attacker, depending on the 
context. The above modes can also be attacked under the assumption that only 
one of the IVs is known to the attacker. Typically, it can be shown that it 
suffices to know the initial value of the output feedback mode involved in every 
one of the six aforementioned modes. 

3.2 Replay Attacks 

In this section we address a slightly different model of attack in which divide 
and conquer strategies may also apply. Here we assume that the attacker knows 
only one of the IVs, but is given the ability to replay the other IV without 
any knowledge of the actual value of it. In other words, one of the IVs may for 
instance be encrypted to offer more security. Then the ability to replay the same 
encrypted unknown IV together with the knowledge of the second initial value 
leads to an attack requiring approximately only one exhaustive key search. In 
some cases, it may even be enough to have the ability to replay an IV without 
any knowledge about its value or the second initial value. This is the case in a 
chosen ciphertext setting. Note that Wagner mentions that some of his chosen-IV 
attacks might be converted into this kind of “replay” attack [?]. 

As an example we describe a chosen ciphertext replay attack on the CBC/OFB 
mode (see Fig. 3). Assume that the attacker knows $IV_1$ and has the ability 
to replay $IV_2$ without having access to its actual value. From the equality of the 
two output feedback streams, we know that for two chosen ciphertexts $(C_0, C_1)$ 
and $(C_0^*, C_1^*)$ we have the following: 

$E_{K_1}(P_0 \oplus IV_1) \oplus C_0 = E_{K_1}(P_0^* \oplus IV_1^*) \oplus C_0^*$. 

Therefore $K_1$ can be found by exhaustive search. Next $K_2$ is recovered by 

$E_{K_2}(E_{K_1}(P_0 \oplus IV_1) \oplus C_0) = C_1 \oplus E_{K_1}(P_1 \oplus E_{K_1}(P_0 \oplus IV_1))$. 

This type of attack applies whenever the initial value of an output feedback 
mode may be replayed, and sometimes even when the initial value of a cipher 
feedback mode is replayed. 

Fig. 3. The CBC/OFB mode 

4 Meet-in-the-Middle Attack 

This attack requires only a handful of plaintext/ciphertext pairs, and about $2^{57}$ 
encryptions. The simple variant needs much more memory than the attacks of the 
previous section, typically $2^{56}$ blocks. The latter requirement is currently hard 
to achieve. However, van Oorschot and Wiener show in [?] that such a standard 
meet-in-the-middle attack can be modified to work with a memory of 16 Gbyte, 
at the cost of $2^{72}$ encryptions; other trade-offs are possible. Their approach is 
based on cycle finding techniques using distinguished points. Therefore, when it 
comes to discussing threat models, we believe that a model in which about 2®® 
blocks have to be stored but only a few queries to the black box are needed is 
far more realistic than a scenario in which 2®® or more queries are made to the 
box (and possibly have to be stored anyway). We are focusing on attacks that 
require as few queries as possible. 

4.1 Double Interleaved Modes 

Again we make the assumption that both IVs are known to the attacker. It is 
easy to see that in this model, none of the 15 double modes can be more secure 
than the ECB/ECB mode. (Note that this does not hold for secret IV attacks, 
which are definitely more interesting from the cryptanalyst’s point of view.) As 
explained in Sect. 3, six of these may also be attacked using exhaustive search 
on one key. Once again this shows that interleaved modes do not provide any 
additional security compared to standard encryption. In this particular setting, 
knowing only one of the IVs and possibly replaying it or the other one does not 
make it possible to mount a meet-in-the-middle attack. 

As an example we show how an attack on CBC/CFB^{-1} proceeds (see Fig. 4). 
We always attack a single block message as in standard meet-in-the-middle 
attacks on two-key double encryption. Choose a fixed plaintext $P_0$ and get the 
corresponding ciphertext $C_0$. Then tabulate $E_{K_1}(IV_1 \oplus P_0) \oplus C_0$ for every possible 
value of the key $K_1$ (store the results in a hash table). Next compute 
$E_{K_2}(IV_2)$ for every possible key $K_2$ and check for a match in 
the table. Each match suggests a possible key pair. Rule out the wrong key pairs 
with a few additional plaintext/ciphertext pairs. 

The attack is essentially the same for every other mode. Just compute the values 
that depend on one half of the key into a hash table, and look up matches with 
the values that depend on the other half of the key. 
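
The meet-in-the-middle attack on the first block of CBC/CFB^{-1} can be sketched in Python at toy scale. The toy 16-bit Feistel permutation `E`, the 10-bit keys, and the sample values are assumptions made for illustration; only the relation $C_0 = E_{K_1}(P_0 \oplus IV_1) \oplus E_{K_2}(IV_2)$ is taken from the description above.

```python
KEY_BITS = 10  # toy key size (DES would use 56)

def E(key, block):
    """Toy 4-round Feistel permutation on 16-bit blocks; not a real cipher."""
    left, right = block >> 8, block & 0xFF
    for rnd in range(4):
        f = (((right + key + 37 * rnd) * 181) >> 3) & 0xFF
        left, right = right, left ^ f
    return (left << 8) | right

def encrypt_first_block(k1, k2, iv1, iv2, p0):
    """First block of CBC/CFB^{-1}: C_0 = E_K1(P_0 xor IV1) xor E_K2(IV2)."""
    return E(k1, p0 ^ iv1) ^ E(k2, iv2)

def meet_in_the_middle(pairs):
    """pairs: (iv1, iv2, p0, c0) single-block samples under one key pair."""
    iv1, iv2, p0, c0 = pairs[0]
    table = {}
    for k1 in range(1 << KEY_BITS):            # tabulate the K1 side
        table.setdefault(E(k1, p0 ^ iv1) ^ c0, []).append(k1)
    candidates = [
        (k1, k2)
        for k2 in range(1 << KEY_BITS)         # match the K2 side
        for k1 in table.get(E(k2, iv2), [])
    ]
    for iv1, iv2, p0, c0 in pairs[1:]:         # rule out false matches
        candidates = [
            (k1, k2) for k1, k2 in candidates
            if encrypt_first_block(k1, k2, iv1, iv2, p0) == c0
        ]
    return candidates

secret = (0x155, 0x2C3)
samples = [(0x0123, 0x4567, 0x89AB), (0x1111, 0x2222, 0x3333), (0xDEAD, 0xBEEF, 0x0F0F)]
pairs = [(a, b, p, encrypt_first_block(*secret, a, b, p)) for a, b, p in samples]
candidates = meet_in_the_middle(pairs)
print(secret in candidates, len(candidates))
```

The table holds one entry per value of $K_1$, which mirrors the memory requirement of the simple variant; the time-memory trade-offs mentioned earlier are not modelled here.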



Fig. 4. The CBC/CFB^{-1} mode 




4.2 Two-Key Triple Interleaved Modes 

We now address the case of some two-key triple interleaved mode, which is ac- 
tually the idea that motivated our research in the first place. We were wondering 
whether two-key triple modes are as secure as standard two-key triple ECB en- 
cryption (the best attack on this scheme is the one by van Oorschot and Wiener 
discussed in Sect. 1 ). 

The result of our investigation is that under a known IV attack, no such mode 
using a feedback mode on its inner layer is more secure than single encryption. 
Indeed, whenever a feedback mode involves the middle key, a mask is applied 
in between the two outer layers and can be computed into a hash table, while 
the outer layers may be computed on their own, and the exors of the results are 
looked up in the table. See Fig. 1 for a picture of the CBC/CFB/OFB mode. 

Query the encryption of a single plaintext block $P_0$, and for every possible 
key $K_2$, compute $E_{K_2}(IV_2)$ into a hash table. Next compute 
$E_{K_1}(P_0 \oplus IV_1) \oplus E_{K_1}(IV_3) \oplus C_0$ for every possible key $K_1$ and look the matches up in the hash 
table. Rule out the wrong key pairs with some additional plaintext/ciphertext 
pairs as usual. 
5 Narrow Pipes and Collision Attacks 

In this section we focus on threat models where one of the IVs is unknown to the 
attacker and show that some cipher block chaining modes fall under collision or 
“narrow pipe” attacks. This setting may be unusual, but the goal here is to 
understand the minimum requirements to mount a collision attack on 
double interleaved modes. From the structure of these modes, it is easy to see that 
the weakness comes from the chaining mode itself. The complexity of this kind of 
attack is still quite acceptable as it requires only about $2^{32}$ plaintext/ciphertext 
pairs of a few blocks each, about $2^{57}$ encryptions and $2^{32}$ memory blocks. 

We show how this attack works on the CFB/CBC^{-1} mode (see Fig. 5) when 
$IV_1$ is not known to the attacker. Randomly encrypt plaintexts of the form 
$(P_0, P_1, M)$ where $M$ is kept constant, and store the ciphertexts and associated 
$IV_2$ values. After about $2^{32}$ trials, a collision occurs on the exor value of the 
second block of the first encryption layer. This collision propagates through the 
cipher feedback of the first layer as well as through the plaintext chaining of the 
second layer to all the following ciphertext blocks. Therefore such a collision has 
most probably occurred when a collision is found on the third ciphertext block. 
Now write the equality of the colliding samples as: 

$D_{K_2}(C_0 \oplus IV_2) \oplus C_1 = D_{K_2}(C_0^* \oplus IV_2^*) \oplus C_1^*$ 

and exhaustively search for the right key $K_2$. Once $K_2$ is found, find the first 
key by exhaustive search using the equation: 

$M \oplus D_{K_2}(C_2 \oplus D_{K_2}(C_1 \oplus D_{K_2}(C_0 \oplus IV_2))) = E_{K_1}(D_{K_2}(C_1 \oplus D_{K_2}(C_0 \oplus IV_2)))$. 

This technique applies to several modes making use of the CBC or CFB mode. 



Fig. 5. The CFB/CBC^{-1} mode 
6 Cycle Attacks 

This attack is actually the dual of the narrow pipe attack. In this case we guess 
one of the keys (say, $K_2$) and peel off the corresponding layer. What remains 
is the output of an OFB-mode, which is in some sense a narrow pipe (64 bits). 
However, in this case, it is very unlikely that a collision will occur in a sequence 
of $2^{35}$ blocks (because the feedback function in OFB-mode is a permutation 
rather than a random mapping). If a collision is observed, we know that our 
guess for the key $K_2$ was wrong. We will show that this attack requires about 
$2^{35}$ plaintext blocks, $2^{87}$ encryptions, and $2^{35}$ memory blocks (or 256 Gbyte). 

This attack applies to the following double modes: CBC“^/OFB, OFB/CBC, 
CFB“^/OFB, and OFB/CFB, even if the IV's are unknown; it represents a 
different trade-off than the attack by Biham (2®“^ chosen plaintexts, 2®® encryp- 
tions). The attack also applies to CBC/OFB, OFB/CBC“^, CFB/OFB, and 
OFB/CFB“^ if one of the /F’s is known. If the mode of which the IV is known 
is not the OFB mode, then one has to choose the plaintext to be constant (for 
example, all zeroes after the 2nd block) in order to make the mode behave like 
the OFB mode. 

Consider for example the OFB/CBC mode (see Fig. 6). The attack proceeds 
as follows. Collect a plaintext containing $\ell = 2^{34.7}$ blocks and the corresponding 
ciphertext. Guess $K_2$, and peel off the CBC mode. One can now compute a 
sequence of $\ell$ blocks that should be the output of the OFB mode. Therefore, 
if the guess for $K_2$ was correct, one does not expect to see a collision (the 
probability that a random starting point lies on an OFB cycle shorter than $\ell$ 
blocks is given by $\ell/2^{64}$, which is negligible in our case$^1$; see for example Flajolet 
and Odlyzko [?]). If the guess for $K_2$ was wrong, the effect is that one obtains 
a random sequence of blocks, which contains a collision with high probability. 
For $\ell \approx \sqrt{2^n}$, this probability is given by $1 - \exp(-\lambda)$ with $\lambda = \ell^2/2^{n+1}$; for 
$\ell = 2^{34.7}$ and $n = 64$, this is equal to $1 - \exp(-\lambda) \approx 1 - 6.8 \cdot 10^{-10}$. (Note that the 
number of collisions is Poisson distributed with parameter $\lambda$ given by the above 
expression.) On average, the collision will occur after $\sqrt{\pi/2} \cdot 2^{n/2}$ blocks [?]. 
If a wrong value of $K_2$ does not result in a collision (an event with probability 
$\exp(-\lambda)$), one has to try all values for $K_1$. The work factor of this attack is given 
by 

$\sqrt{\pi/2} \cdot 2^{n/2} \cdot 2^{k-1} + \exp\left(-\frac{\ell^2}{2^{n+1}}\right) \cdot (\ell + 2^k) \cdot 2^{k-1} + \ell + 2^{k-1}$. 

The first term is the expected work factor to eliminate guesses for $K_2$ that result 
in a collision. The second term corresponds to the guesses for $K_2$ for which no 
collision occurs, which implies that an exhaustive search for $K_1$ is required. The 
last two terms correspond to the expected effort for the correct value of $K_2$; they 
are negligible compared to the first two terms. The second term decreases with 
$\ell$, and becomes negligible with respect to the first one if $\ell > 2^{34.7}$. The total 
work factor is then approximately equal to $\sqrt{\pi/2} \cdot 2^{n/2 + k - 1} \approx 2^{87}$. 

$^1$ Such a short cycle is easy to detect. 
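
The orders of magnitude in the cycle attack can be checked numerically. The Python sketch below evaluates the collision probability and the work factor for DES parameters ($n = 64$, $k = 56$). The sample length $\ell = 2^{34.7}$ blocks and the exact form of the work factor (four terms: wrong guesses with a collision, wrong guesses without one, and the effort for the correct guess) are assumptions reconstructed from the description in this section, so the output should be read as a plausibility check only.

```python
import math

n, k = 64, 56            # block length and key length of DES
ell = 2 ** 34.7          # assumed number of collected blocks

# Poisson parameter for collisions in a random sequence of ell blocks.
lam = ell ** 2 / 2 ** (n + 1)
p_no_collision = math.exp(-lam)      # a wrong guess shows no collision

# Expected number of blocks until the first collision for a wrong guess.
expected_collision = math.sqrt(math.pi / 2) * 2 ** (n / 2)

work = (
    expected_collision * 2 ** (k - 1)                   # wrong guesses, collision seen
    + p_no_collision * (ell + 2 ** k) * 2 ** (k - 1)    # wrong guesses, no collision
    + ell + 2 ** (k - 1)                                # the correct guess
)

memory_gbyte = 2 ** 35 * 8 / 2 ** 30   # 2^35 blocks of 8 bytes each

print(f"lambda = {lam:.2f}")
print(f"P(no collision | wrong guess) = {p_no_collision:.2e}")
print(f"log2(work factor) = {math.log2(work):.2f}")
print(f"memory for 2^35 blocks = {memory_gbyte:.0f} Gbyte")
```

Under these assumptions the first term dominates, which is consistent with the claim that the total work factor is close to the cost of eliminating the wrong guesses for $K_2$.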




At first sight, one might think that this attack also applies to the ECB/OFB 
and OFB/ECB modes. However, in this case a wrong key guess means that 
we encrypt the OFB sequence in double DES (with the correct key $K_2$ and the 
wrong key guess $K_2'$ respectively). A double-DES encryption in ECB mode does 
not create collisions, which implies that it is not possible to distinguish between 
wrong and correct guesses. 

The attack also applies to eight 2-key triple modes with OFB in the middle, 
where the first mode is CBC^{-1}, CFB^{-1}, or ECB and the last mode is CBC, CFB, 
or ECB (the only exception is the ECB/OFB/ECB mode). If the corresponding 
IV is known, the OFB mode is also allowed for the first or last encryption, the 
CBC^{-1} and CFB^{-1} are allowed for the last encryption, and CBC and CFB are 
allowed for the first encryption. 



Fig. 6. The OFB/CBC mode 



7 What’s Left for Common Applications? 

In this section we first summarise our results. Subsequently we look at which 
pragmatic solutions are available to designers who want a short-term solution 
with an acceptable security level. 

7.1 Summarising Our Results 

The results of the previous sections are summarised in Table 1. We will denote divide 
and conquer attacks as DIV&C, replay attacks as RPL, meet-in-the-middle 
attacks as MITM, collision attacks as COLL, and cycle attacks as CYCLE. We 
consider different known IV cases, as well as the case where no IV is known. 
The associated complexities are the following (known or chosen plaintexts/off-line 
computations/memory requirements): 

— Divide and conquer or Replay attacks: 4/$2^{57}$/- ; 

— Meet-in-the-middle attacks: 4/2®^/2®®, 4/2®®/2"^°, or 4/2^^/2^^; 

— Collision attacks: $2^{32}$/$2^{57}$/$2^{32}$ (chosen plaintexts); 

— Cycle attacks: $2^{35}$/$2^{87}$/$2^{35}$ (1 single plaintext, sometimes chosen; see Sect. 6). 

Table 1 shows how vulnerable double modes can become if the attacker obtains 
information about the initial values or can manipulate these. We would 
like to stress that this is the case in many applications and that designers should 
make the right choices having these numbers in mind. 



Table 1. Double modes under known/unknown IV attacks 



Mode           Known IV1 and IV2   Known IV1      Known IV2      Unknown IVs 

ECB/OFB        -                   -              DIV&C          RPL 
OFB/ECB        -                   DIV&C          -              RPL 
CBC/CBC^{-1}   MITM                COLL           COLL 
CBC/OFB        MITM                CYCLE / RPL    CYCLE 
CBC/CFB^{-1}   MITM                COLL / RPL     COLL 
CBC^{-1}/OFB   DIV&C               CYCLE / RPL    DIV&C          CYCLE / RPL 
OFB/CBC        DIV&C               DIV&C          CYCLE / RPL    CYCLE / RPL 
OFB/CBC^{-1}   MITM                CYCLE          CYCLE 
OFB/OFB        MITM                CYCLE / RPL    CYCLE / RPL 
OFB/CFB        DIV&C               DIV&C          CYCLE / RPL    CYCLE / RPL 
OFB/CFB^{-1}   MITM                CYCLE / RPL    CYCLE / RPL 
CFB/CBC^{-1}   MITM                COLL           COLL / RPL 
CFB/OFB        MITM                CYCLE / RPL    CYCLE / RPL    RPL 
CFB/CFB^{-1}   MITM                COLL / RPL     COLL / RPL 
CFB^{-1}/OFB   DIV&C               CYCLE / RPL    DIV&C          CYCLE / RPL 



7.2 Discussion 

An important question is: which solutions remain with a reasonable security level 
that require only two keys? This implies that we are not worried about attacks 
that require more than 2®° chosen plaintexts or ciphertexts, or that require a 
work factor of 2^°° or more. The choice of these numbers is rather arbitrary; note 
however that it is easy to prevent attackers from having access to more than 2®*^ 
plaintext/ciphertext pairs by changing the keys more frequently, or by taking 
the system out of service early. However, once encrypted data is made public,
an opponent can record it and wait 20 or 30 years (or even more) before he
attempts to decipher it. We also assume that the keys are generated or derived
in such a way that related key attacks are precluded [5].

We distinguish between four cases: 

— If the IV’s are encrypted and their integrity is protected by a MAC (which
also should prevent replay), we recommend the simple OFB/OFB
mode (scheme 28 in |S|). The best known attack described in jH] has
complexity 2^65/2^65/2^64; such an attack is not realistic in many environments.
Note also that chosen-IV attacks are precluded by the use of the MAC algorithm.
This mode provides no error propagation; if the authenticity of the
information is a concern, it is recommended to calculate a MAC over the IV’s
and over the plaintext.

As a MAC algorithm, MacDES could be used [20]; this algorithm extends
the well known CBC-MAC with double-DES in the first and last encryption 
(but with different but related keys for the 2nd encryption). MacDES seems 
to provide high security at relatively low cost; forgery attacks are not feasible 
if the number of plaintexts encrypted with a single key is reduced to 2^^ (or a 
little less), and the best known key recovery attack requires 2®® chosen text- 
MAC pairs, 2®® encryptions, 2®® MAC verifications, and 2®"^ bytes memory. 
Wagner discusses the use of encrypted and authenticated IV’s and argues
that “adding this much complexity to the system may begin to test the limits
of one’s comfort zone.” However, our point of view is that the currently
known attacks on multiple modes tend to be very complex as well; moreover,
MAC algorithms are probably better understood than encryption modes. It
is of course important to apply appropriate key separation techniques (this
may require additional evaluation).

— If the IV’s are encrypted but their integrity is not protected by a MAC,
we recommend the CBC/CBC^-1 mode (scheme 15 in ^). The best
known attack described in has complexity 2^68/2^66/2^66. This mode seems
to provide better security against IV replay attacks than the previous one.
In order to simplify the scheme, one can choose IV1 = IV2.

— We do not recommend a double mode where the IV’s are not encrypted
but their integrity is protected. It follows from Table 1 that all these
schemes succumb to a meet-in-the-middle attack that requires only a few
plaintext/ciphertext pairs. For this case, we suggest the OFB[CBC,CBC^-1]
mode proposed by Biham 0; this notation means that one first applies the
OFB mode, then CBC, then OFB (with the same key stream), then CBC^-1
and finally OFB (again with the same key stream). Wagner asserts that his
chosen-IV attacks do not apply to this scheme |^. Our preliminary
evaluation suggests that the security level of this mode is still sufficiently high
even if the same key is used for the CBC and CBC^-1 encryption.

— If the IV’s are not encrypted and their integrity is not protected by a MAC, 
one could also use the previous scheme; however, we do not recommend this 
solution.

We understand that making such recommendations is a little risky; indeed,
there are certificational attacks on these schemes (except for the last one), and in 
the past years significant progress has been made in the cryptanalysis of multiple 
modes. On the other hand, we believe that it is important to point out to the 
research community (and to practitioners) that for some of the schemes, we are 
not aware of any realistic attacks. 

We also recall a number of other schemes that can serve as “reference points” 
(note that the major motivation for introducing new modes was precisely to avoid 
the drawbacks of the first two of them): 

— 2-key triple-DES in outer-CBC mode: the main attacks here are a matching
ciphertext attack and the van Oorschot-Wiener attack m with the following
parameters: 2^t/2^{120-t}/2^t (see Table 2).

— DESX in CBC mode [17]: the matching ciphertext attack applies as well;
the security bound is 2^t/2^{119-t}/2^t. A disadvantage is that this solution has
a smaller margin to shortcut attacks (differential and linear cryptanalysis)
than all double or triple modes.

— DEAL-128 in CBC mode m: this is certainly an interesting alternative.
The best known attack on this cipher is 2^70/2^121/2^67, where the texts are
chosen plaintexts. A small disadvantage is the slow key schedule (6 DES
encryptions). We believe that further research is necessary on this solution,
as DEAL is a new block cipher rather than a mode (see also [22]).

Table 2 compares the efficiency of the solutions proposed. The security level
corresponds to the best attack known. For DESX this is the security bound
proved, but if the underlying block cipher is DES, shortcut attacks apply with
a lower complexity (differential |H| and linear cryptanalysis [23]). If the IV’s are
encrypted, this requires 3 encryptions per IV (2-key triple encryption), except
for DESX, where IV is encrypted using DESX. For the OFB/OFB scheme, it is
assumed that the MAC algorithm is applied to both the IV’s and the plaintext;
this implies that this variant also provides guarantees on the message integrity.
If the MAC is applied only to the IV’s, the number of encryptions drops to
2t + 10. Note that for the CBC/CBC^-1 scheme a single IV is used. The MAC
algorithm used is MacDES [20]; it requires t + 2 encryptions and requires that
t > 2.
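
To make the cost comparison concrete, the following minimal sketch tabulates the number of block cipher encryptions for a t-block plaintext. The formulas are the ones read from Table 2 and the paragraph above; the function name `encryption_counts` and the dictionary labels are illustrative assumptions, not part of the original analysis.

```python
def encryption_counts(t: int) -> dict:
    """Block cipher encryptions needed for a t-block plaintext.

    The formulas are transcribed from Table 2 and the surrounding text;
    the labels are ad hoc names chosen for this sketch.
    """
    if t < 1:
        raise ValueError("t must be a positive number of blocks")
    return {
        "OFB/OFB, MAC over IVs and plaintext": 3 * t + 10,
        "OFB/OFB, MAC over IVs only": 2 * t + 10,
        "CBC/CBC^-1": 2 * t + 3,
        "OFB[CBC,CBC^-1]": 3 * t + 5,
        "2-key triple-DES, outer-CBC": 3 * t + 3,
        "DESX in CBC": t + 1,
        "DEAL in CBC": 3 * t + 3,
    }


# Example: cost of each scheme for a 100-block message.
counts = encryption_counts(100)
for mode, count in counts.items():
    print(f"{mode:40s} {count}")
```

The fixed terms cover IV encryption and MAC overhead, so they dominate for short messages, while the coefficient of t dominates for long ones.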

8 Conclusions and Open Problems 

We have analysed the security of double and 2-key triple modes under the as- 
sumption that information on the IV’s is available. Under this model, most of 
these schemes are practically insecure. This extends the work of Biham who has 
shown that these modes are theoretically insecure when the IV’s are secret. We 
have also introduced a new attack, the cycle attack, that reduces the plaintext 
requirement for certain double and triple modes (at the cost of an increased 
number of off-line encryptions). 

Table 2. Summary of properties of several schemes when encrypting a t-block plaintext

mode                          encrypt   authenticate   number of     security
                              IV’s      IV’s           encryptions
OFB/OFB                       yes       yes            3t+10         2^65/2^65/2^64
CBC/CBC^-1                    yes       no             2t+3          2^68/2^66/2^66
OFB[CBC,CBC^-1]               no        yes            3t+5          2/2“V-
2-key triple-DES outer-CBC    yes       no             3t+3          2^t/2^{120-t}/2^t; 2^32 Match. Ciph.
DESX in CBC                   yes       no             t+1           2^t/2^{119-t}/2^t; 2^32 Match. Ciph.
DEAL in CBC                   yes       no             3t+3          2^70/2^121/2^67



We have also compared the security level and performance of a number of 
alternatives that offer a reasonable security level against attacks that require 
less than 2 '^'^ known or chosen texts. Some of these schemes seem to provide a 
simple solution that is easy to analyse, but we caution the reader against too 
much optimism. We leave it as an open problem to improve the attacks on the
schemes listed in Table 2.

Abstract. Whereas a block cipher enciphers messages of some one particular
length (the blocklength), a variable-input-length cipher takes messages
of varying (and preferably arbitrary) lengths. Still, the length of
the ciphertext must equal the length of the plaintext. This paper introduces
the problem of constructing such objects, and provides a practical
solution. Our VIL mode of operation makes a variable-input-length
cipher from any block cipher. The method is demonstrably secure in
the provable-security sense of modern cryptography: we give a quantitative
security analysis relating the difficulty of breaking the constructed
(variable-input-length) cipher to the difficulty of breaking the underlying
block cipher.

Keywords: Ciphers, Modes of Operation, Provable Security, Symmetric
Encryption.



1 Introduction 

This paper introduces the question of how to construct ciphers which operate on
messages of varying lengths. Such a cipher, F, maps a key K and a plaintext M
in {0,1}^* (or M in some other set containing strings of various lengths) into
a ciphertext C = F_K(M) having the same length as M. Note that the length
of M is not restricted to some fixed blocklength n, or even to some multiple of a
blocklength. At the same time, being a cipher, F_K is a length-preserving
permutation for which possession of K enables the efficient computation of both
F_K and F_K^{-1}.

The ciphers we construct have a strong security property: we want that no
efficient adversary can distinguish an oracle for F_K(·), for a random and secret K,
from an oracle for a random length-preserving permutation π(·) (having the
same domain as F_K). This is the (now customary) requirement for a block
cipher (security in the sense of being a “pseudorandom permutation,” or “PRP”)
originally suggested in [ I i )l4j , and so it is the property we want for any
variable-input-length cipher as well.

L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 231-^^^ 1999.

© Springer-Verlag Berlin Heidelberg 1999

One could try to construct a variable-input-length cipher from scratch, in
the confusion/diffusion tradition. But that approach is specialized and
error-prone. Instead, we provide constructions which assume one already has in hand
some underlying block cipher. We give a “mode of operation”, VIL mode (for
“variable-input-length enciphering”), which enciphers strings of arbitrary length
(but at least n) using an n-bit block cipher.

We prove the soundness of VIL mode in the provable-security sense of modern
cryptography: if the underlying block cipher is secure then so is the
variable-input-length cipher we construct from it. VIL is actually more than one
particular mode of operation; it is an approach for making a variable-input-length
cipher that can be realized in many different ways.

Why variable-input-length ciphers? The obvious use of variable-input-length
ciphers is to encrypt (ie, provide privacy protection) without any increase
in message length. Suppose we’ll be encrypting messages M_1, M_2, ··· where the
lengths of these messages may vary. We want to create ciphertexts C_1, C_2, ···
where |C_i| = |M_i| and where ciphertext C_i hides everything about M_i (with
respect to efficient computation) except for the length of M_i and which earlier
plaintext, if any, equals M_i.

It is important to understand that the last sentence embodies a weaker
notion of privacy than the customary one, semantic security, and its equivalent
formulations m. A semantically secure encryption computationally hides all
information about M_i except for |M_i|; in particular, one does not allow to be
leaked which earlier plaintext (if any) a given ciphertext corresponds to. But
you pay a price for this added security: semantically secure encryption cannot
possibly be length preserving. Thus length-preserving “encryption”
(enciphering) embodies a tradeoff: shorter ciphertexts at the cost of an inferior security
guarantee (and slower encryption/decryption).

Is this tradeoff a good one? If you don’t know anything about how the
encryption will be used, then we’ll have to say no. But there are applications when
the tradeoff is a good one. Let us give an example.

In networking applications a “packet format” may have been defined, this
packet format having various fields, none of which were intended for
cryptographic purposes. Now suppose a need arises to add in privacy features but, at the
same time, it is no longer desirable (or feasible) to adjust the packet format.
It cannot be lengthened by even one bit. Enciphering with a variable-input-length
cipher leaves the packet size alone, and it leaves packets looking identical
(after deciphering) to the way they looked before. This contributes to
ease-of-acceptance, an easier migration path, and better code-reuse. These factors may
outweigh the security consideration that we will be leaking which packets of a
session are identical to which earlier ones.

As a second example, we may have a priori reason to believe that all the
plaintexts M_1, M_2, ··· will be distinct. For example, each message may be known
to contain a sequence number. In such a case the additional piece of information
that secure encipherment leaks amounts to no information at all, and so here
enciphering provides a way to achieve semantic security in a way that is both
length-minimal and oblivious to the formatting conventions of each message
(eg, where the sequence number appears in each message). This obliviousness
contributes to the making of robust software; when message formats change the
cryptography need not be adjusted. With typical length-minimal approaches this
would not have been true.

Variable-input-length ciphers may prove to be useful tools for protocol
design. As an example, Rivest put forward the idea of “strongly non-separable
encryption” wherein an adversary with a ciphertext C who guesses an
encryption key K should have to invest Ω(|C|) time before obtaining information
useful to verify if C was enciphered under K. Variable-input-length enciphering
provides a simple way to provably achieve Rivest’s goal.

The difficulty. It is not so clear how to construct a secure variable-input-length
cipher from a block cipher. We are making a stringent security
requirement: we expect our ciphers to approximate a family of random permutations.
In addition, we want them to be length-preserving permutations. This
eliminates any hope of using conventional modes of operation. Consider, for example,
using DES in CBC mode with a zero initialization vector (IV). For simplicity,
assume the message length is a multiple of the blocklength.¹ This does not give
a cipher that approximates a family of random permutations: if two plaintexts
agree on blocks 1, ..., i then their ciphertexts agree on blocks 1, ..., i, which is
almost never true of a random permutation. To get around this one might try
to make the IV some sort of hash of the message, but then how could one get
a length-preserving construction?
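
The prefix-preservation problem with zero-IV CBC can be observed directly. The sketch below is illustrative only: `toy_encrypt` is a hypothetical 4-round Feistel permutation built from SHA-256 that stands in for DES (which is not in the Python standard library), and the key and messages are made up.

```python
import hashlib

BLOCK = 8  # bytes; stands in for the 64-bit DES block


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical 4-round Feistel permutation on 8-byte blocks (NOT DES)."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor(left, f)
    return left + right


def cbc_zero_iv(key: bytes, msg: bytes) -> bytes:
    """CBC encryption with an all-zero IV; msg must be a multiple of BLOCK."""
    assert len(msg) % BLOCK == 0
    prev, out = bytes(BLOCK), b""
    for i in range(0, len(msg), BLOCK):
        prev = toy_encrypt(key, xor(prev, msg[i:i + BLOCK]))
        out += prev
    return out


key = b"demo-key"
m1 = b"AAAAAAAA" + b"BBBBBBBB" + b"CCCCCCCC"
m2 = b"AAAAAAAA" + b"BBBBBBBB" + b"DDDDDDDD"  # differs only in block 3
c1, c2 = cbc_zero_iv(key, m1), cbc_zero_iv(key, m2)

# The first two ciphertext blocks coincide; a random permutation on
# 24-byte strings would almost never behave this way.
same_prefix = c1[:2 * BLOCK] == c2[:2 * BLOCK]
print(same_prefix)
```

A distinguisher therefore needs only two queries that share a prefix to tell this construction from a random length-preserving permutation.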

Our method. We suggest simple and efficient ways for making variable-input-length
ciphers from block ciphers. Our VIL mode of operation makes two passes
over the message. In our preferred instantiation, the first pass computes some
sort of CBC MAC over the message M, while the second pass encrypts M (in
counter mode, for example) using the pass-one MAC as the IV. However, one
cannot take the ciphertext C for M to be the pass-two ciphertext (including the
IV), since this would be too long. Instead, we exploit a certain feature of the
CBC MAC, which we call its “parsimoniousness.” This enables us to drop one
block from the pass-two ciphertext and still be able to recover the plaintext.
(This is the main idea of our construction.) There are some technical matters
that complicate things; see Section 2 and Fig. 1.

Our approach can be instantiated in many further ways; it actually
encompasses many modes of operation. We describe VIL mode in terms of two
specialized tools: what we call a “parsimonious” pseudorandom function (PRF)
and a “parsimonious” encryption scheme. Both of these tools can be
constructed from block ciphers, and we show a few ways to do this. Thinking of VIL
mode in these general terms not only provides versatility in instantiation, but,
equally important, our proof of correctness is made much simpler by the added
generality: what is irrelevant is out of sight, and what is relevant can be singled
out and separately proved, in part by invoking known results mm.

¹ The difficult issue is not in dealing with messages of length not a multiple of the
blocklength; there are well-known methods for dealing with this, like stream-cipher
encrypting the short block and ciphertext stealing. See cni Chapter 2] for a
description of these techniques.



Related work. There is quite a lot of work on constructing block ciphers of
one blocklength given block ciphers of another blocklength. Luby and Rackoff
consider the question of how to turn an n-bit to n-bit pseudorandom function
(PRF) into a 2n-bit to 2n-bit block cipher. They show that three rounds of
the Feistel construction suffice for this purpose, and that four rounds suffice to
obtain a “super” PRP from a PRF. The paper has spawned much work.

Naor and Reingold provide a construction which extends a block cipher
on n bits to a block cipher on N = 2ni bits, for any desired i ≥ 1. A variation
on their construction yields a cipher on N = ni bits for any i ≥ 1 ng. It is
unclear how to use these constructions for arbitrary N (meaning not necessarily
a multiple of n) and across assorted input lengths.

Lucks CH generalizes Luby-Rackoff to consider a three round unbalanced
Feistel network, using hash functions for round functions. This yields a block
cipher on any given length N by starting with a PRF of r bits to ℓ bits and
another of ℓ bits to r bits where r + ℓ = N. Of course this requires the availability
of the latter primitives for given values of r, ℓ.

Anderson and Biham P provide two constructions for a block cipher (BEAR
and LION) which use a hash function and a stream cipher. This too is an
unbalanced Feistel network.

Some ciphers which are intended to operate on blocks of various lengths have
been constructed from scratch. The CMEA (attacked by m) is an example.

A “forward-then-backwards” mode of operation is described in 0, under
the names “Triple-DES Key Wrap” and “RC2 Key Wrap.” While not
length-preserving, a length-preserving variant is possible, and it might be a good cipher
across messages of assorted lengths. See Section 0 for further discussion.

We have already mentioned Rivest’s “strongly non-separable” encryption m
and that variable-input-length enciphering provides one mechanism to achieve
that goal.

The VIL mode of operation was invented in 1994 when the authors were at
IBM 0. No security analysis was provided at that time.



2 VIL Mode Example 



In this section we describe one particular instantiation of VIL mode enciphering.
For concreteness, let us start from DES, a map DES : {0,1}^56 × {0,1}^n →
{0,1}^n where n = 64. Using this map we construct the function
F : {0,1}^{56×3} × {0,1}^{≥64} → {0,1}^{≥64} for enciphering strings of length
at least 64. (Extending to messages of length less than 64 will be discussed
later.) Given a key K = K1 || K2 || K3, partitioned into three 56-bit pieces, and
given a plaintext M ∈ {0,1}^{≥64}, form the ciphertext C = F_K(M) as depicted
in Fig. 1 and as specified here:

Algorithm F_{K1 || K2 || K3}(M)

(1) Let M_prefix be the first |M| − n bits of M. Let M_suffix be the remaining bits.

(2) Let pad be a “1” followed by the minimum number of “0” bits such that
|M| + |pad| is divisible by 64.

(3) Partition M_prefix || pad || M_suffix into 64-bit blocks M_1 ··· M_m.

(4) Let C_0 = 0^n, and let C_i = DES_K1(C_{i−1} ⊕ M_i) for all 1 ≤ i ≤ m.

(5) Let σ = DES_K2(C_m).

(6) Let P be the first |M| − n bits of
DES_K3(σ) || DES_K3(σ + 1) || DES_K3(σ + 2) ··· .

(7) Let C_prefix = P ⊕ M_prefix.

(8) Return ciphertext C = σ || C_prefix.

The computation of C can be looked at as having two stages. In the first stage
(Steps 1-5) we compute σ, which is some sort of CBC-MAC of M under the key
K1 || K2. In the second stage (Steps 6-7) we encrypt M, except for M’s last
64 bits, under key K3. We use counter-mode encryption with an initialization
vector of σ. The ciphertext is the MAC σ together with the encrypted prefix
of M.

The MAC σ is not computed by the “basic” CBC-MAC, but some variant of
it. Our constraints preclude using the CBC-MAC in its customary form. First we
need to be able to properly handle messages of arbitrary length (the basic
CBC-MAC is only secure on messages of some fixed length, this length being a multiple
of the blocklength). But in addressing this issue we must ensure that given σ
and an |M| − 64 bit prefix of M, we are able to reconstruct the last 64 bits of M.
That this can be done can be seen in the following algorithm for computing
F^{-1}_{K1 || K2 || K3}(C). As before, C ∈ {0,1}^{≥64} and K1, K2, K3 ∈ {0,1}^56. The
existence of the following algorithm demonstrates that F is indeed a cipher.

Algorithm F^{-1}_{K1 || K2 || K3}(C)

(1) Let σ be the first 64 bits of C. Let C_prefix be the remaining bits.

(2) Let P be the first |C_prefix| bits of
DES_K3(σ) || DES_K3(σ + 1) || DES_K3(σ + 2) ··· .

(3) Let M_prefix = P ⊕ C_prefix.

(4) Let pad be a “1” followed by the minimum number of “0” bits such that
|M_prefix| + |pad| is divisible by 64.

(5) Partition M_prefix || pad into 64-bit blocks M_1 ··· M_{m−1}.

(6) Let C_0 = 0^n, and let C_i = DES_K1(C_{i−1} ⊕ M_i) for all 1 ≤ i ≤ m − 1.

(7) Let M_m = DES^{-1}_K1(DES^{-1}_K2(σ)) ⊕ C_{m−1}.

(8) Return M = M_prefix || M_m.

The interesting step is Step 7, where one exploits the structure of (this version 
of) the CBC-MAC to compute the last block of plaintext. 
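
The two algorithms above can be exercised end to end with a small sketch. Several things here are assumptions made for illustration: `E` and `E_inv` are a hypothetical 4-round Feistel permutation built from SHA-256 that stands in for DES, messages are restricted to whole bytes (so the pad is the byte 0x80 followed by zero bytes), and σ + i is taken modulo 2^64.

```python
import hashlib

BLOCK = 8  # bytes; plays the role of n = 64 bits


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]


def E(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for DES_K: a 4-round Feistel permutation."""
    l, r = block[:4], block[4:]
    for rnd in range(4):
        l, r = r, xor(l, _f(key, rnd, r))
    return l + r


def E_inv(key: bytes, block: bytes) -> bytes:
    """Inverse of E: undo the Feistel rounds in reverse order."""
    l, r = block[:4], block[4:]
    for rnd in reversed(range(4)):
        l, r = xor(r, _f(key, rnd, l)), l
    return l + r


def pad(prefix_len: int) -> bytes:
    # 0x80 is a "1" bit followed by seven "0" bits; then zero bytes up to
    # the next block boundary (Step 2 of enciphering, Step 4 of deciphering).
    return b"\x80" + bytes((-prefix_len - 1) % BLOCK)


def cbc_chain(key: bytes, data: bytes) -> bytes:
    """Return the last CBC chaining value C_i over data, with C_0 = 0."""
    c = bytes(BLOCK)
    for i in range(0, len(data), BLOCK):
        c = E(key, xor(c, data[i:i + BLOCK]))
    return c


def keystream(key: bytes, sigma: bytes, length: int) -> bytes:
    """First `length` bytes of E(sigma) || E(sigma+1) || ... (mod 2^64)."""
    ctr, out = int.from_bytes(sigma, "big"), b""
    while len(out) < length:
        out += E(key, (ctr % 2**64).to_bytes(BLOCK, "big"))
        ctr += 1
    return out[:length]


def vil_encipher(k1: bytes, k2: bytes, k3: bytes, msg: bytes) -> bytes:
    assert len(msg) >= BLOCK, "this sketch needs at least one full block"
    prefix, suffix = msg[:-BLOCK], msg[-BLOCK:]
    sigma = E(k2, cbc_chain(k1, prefix + pad(len(prefix)) + suffix))
    return sigma + xor(keystream(k3, sigma, len(prefix)), prefix)


def vil_decipher(k1: bytes, k2: bytes, k3: bytes, ct: bytes) -> bytes:
    assert len(ct) >= BLOCK
    sigma, c_prefix = ct[:BLOCK], ct[BLOCK:]
    prefix = xor(keystream(k3, sigma, len(c_prefix)), c_prefix)
    c_prev = cbc_chain(k1, prefix + pad(len(prefix)))  # C_{m-1}
    last = xor(E_inv(k1, E_inv(k2, sigma)), c_prev)    # Step 7
    return prefix + last


k1, k2, k3 = b"key-one", b"key-two", b"key-three"
msg = b"variable-input-length!"
ct = vil_encipher(k1, k2, k3, msg)
print(len(ct) == len(msg), vil_decipher(k1, k2, k3, ct) == msg)
```

The ciphertext has exactly the length of the plaintext because the last plaintext block is never transmitted in any form: it is recomputed from σ and the chaining value C_{m−1}.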




236 M. Bellare and P. Rogaway 




Fig. 1. An example way to realize VIL-mode encipherment. Here we use the block
cipher DES. In this example the message M to encipher is a few bits longer than 64 × 3
bits. The underlying key is K = K1 || K2 || K3. The ciphertext is C = σ || C_prefix.



We remark that standard methods, like setting Ki = DES_K(i)[1..56], would
allow K1, K2 and K3 to be derived from a single 56-bit key, in which case F
would be a map F : {0,1}^56 × {0,1}^{≥64} → {0,1}^{≥64}.

We also remark that the domain can be extended to all of {0,1}^* (that
is, we can encipher strings of fewer than 64 bits) using methods which we
will later discuss. However, these methods have not been proven secure with
desirable security bounds.

It should be kept in mind that the above example is just one way to
instantiate VIL-mode encipherment. Both stages (the computation of σ and the
encryption of M_prefix) can be accomplished in other ways. We now move towards
these generalizations.

3 The General Approach

Towards the general description of VIL and its proof of correctness we now make
some definitions.

Preliminaries. A message space ℳ is a nonempty subset of {0,1}^* for which
M ∈ ℳ implies that M' ∈ ℳ for all M' of the same length as M. A ciphertext
space (or range) 𝒞 is a nonempty subset of {0,1}^*. A key space 𝒦 is a nonempty
set together with a probability measure on that set. A pseudorandom function
(PRF) with key space 𝒦, message space ℳ and range 𝒞 is a set of functions
F = {F_K | K ∈ 𝒦} where each F_K : ℳ → 𝒞. We usually write F : 𝒦 × ℳ → 𝒞.
We assume that |F_K(M)| depends only on |M|. A cipher is a PRF F : 𝒦 × ℳ →
ℳ in which each F_K : ℳ → ℳ is a bijection. A block-cipher is a cipher
F : 𝒦 × {0,1}^n → {0,1}^n. The number n is called the blocklength.

Let ℳ be a message space and let ℓ : N → N be a function. We define
“reference” PRFs Rand(ℳ, ℓ) and Perm(ℳ). A random function ρ ← Rand(ℳ, ℓ) is
defined as follows: for each M ∈ ℳ, let ρ(M) be a random string in {0,1}^{ℓ(|M|)}.
A random function π ← Perm(ℳ) is defined as follows: for each number i such
that ℳ contains strings of length i, let π_i be a random permutation on {0,1}^i,
and define π(M) = π_i(M) where i = |M|.

We define security following 0, adapted to concrete security as in Hj. A
distinguisher is a (possibly probabilistic) algorithm A with access to an oracle.
Let A be a distinguisher and let F = {F_K | K ∈ 𝒦} be a PRF with key space 𝒦
and |F_K(M)| = ℓ(|M|). Then we let

    Adv^prf_F(A) = Pr[K ← 𝒦 : A^{F_K(·)} = 1] − Pr[ρ ← Rand(ℳ, ℓ) : A^{ρ(·)} = 1] and

    Adv^prp_F(A) = Pr[K ← 𝒦 : A^{F_K(·)} = 1] − Pr[π ← Perm(ℳ) : A^{π(·)} = 1] .

Define the functions Adv^prf_F(t, q, μ) = max_A {Adv^prf_F(A)} and Adv^prp_F(t, q, μ) =
max_A {Adv^prp_F(A)} where the maximum is over all adversaries which run in time
at most t and ask at most q oracle queries, these queries totaling at most μ bits.
We omit the argument μ when ℳ contains strings of just one length. Time is
always understood to include the space for the description of the distinguishing
algorithm. Throughout, if the distinguisher inquires as to the value of oracle f
at a point M ∉ ℳ then the oracle responds with the distinguished point ⊥. We
assume there is a (simple) algorithm to decide membership in ℳ and so we may
assume adversaries do not make such queries.

Parsimonious PRF. Let G : 𝒦 × ℳ → {0,1}^n be a PRF where ℳ only
includes strings of length at least n. Then G is said to be parsimonious if for
all K ∈ 𝒦 and all M ∈ ℳ, the last n bits of M are uniquely determined
by the remaining bits of M, the key K, and G_K(M). In other words, with a
parsimonious PRF G, if you know K and receive the n-bit value σ = G_K(M)
then you don’t need to receive all of M in order to know what it is: it is sufficient
to get the |M| − n bit prefix of M, M_prefix; from that you can recover the n missing
bits by applying some function Recover_K(M_prefix, σ) associated to the PRF.

Examples. The (basic) CBC-MAC is a parsimonious PRF. Assume a block
cipher E : 𝒦 × {0,1}^n → {0,1}^n. Fix a constant m ≥ 1. Consider the PRF
G : 𝒦 × {0,1}^{nm} → {0,1}^n defined by G_K(M_1 ··· M_m) = C_m, where C_0 = 0^n
and C_i = E_K(M_i ⊕ C_{i−1}) for 1 ≤ i ≤ m. To recover M_m from K, M_1 ··· M_{m−1},
and σ = G_K(M_1 ··· M_m), compute C_0, C_1, ..., C_{m−1} by C_0 = 0^n and C_i =
E_K(C_{i−1} ⊕ M_i) and then, since C_m = E_K(M_m ⊕ C_{m−1}), recover M_m as M_m =
E_K^{-1}(σ) ⊕ C_{m−1}. Note that it is crucial that we use the “full” CBC MAC (that is,
the MAC is all of C_m, not a proper prefix). In P] it is shown that the CBC MAC is
secure whenever E is, in the sense that
Adv^prf_G(t, q) ≤ Adv^prp_E(t', q') + 3 q^2 m^2 2^{−n−1}
where t' ≈ t and q' = qm.
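
The recovery procedure for the basic CBC-MAC can be written out in a few lines. This is a minimal sketch under stated assumptions: `E` and `E_inv` are a hypothetical SHA-256-based Feistel permutation standing in for the block cipher E_K, and the names `cbc_mac` and `recover_last` are chosen here for illustration.

```python
import hashlib

BLOCK = 8  # bytes; the blocklength n expressed in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:4]


def E(key: bytes, block: bytes) -> bytes:
    """Hypothetical block cipher: a 4-round Feistel permutation."""
    l, r = block[:4], block[4:]
    for rnd in range(4):
        l, r = r, xor(l, _f(key, rnd, r))
    return l + r


def E_inv(key: bytes, block: bytes) -> bytes:
    l, r = block[:4], block[4:]
    for rnd in reversed(range(4)):
        l, r = xor(r, _f(key, rnd, l)), l
    return l + r


def cbc_mac(key: bytes, blocks: list) -> bytes:
    """Full CBC-MAC: C_0 = 0, C_i = E(M_i xor C_{i-1}); output all of C_m."""
    c = bytes(BLOCK)
    for m in blocks:
        c = E(key, xor(m, c))
    return c


def recover_last(key: bytes, leading_blocks: list, sigma: bytes) -> bytes:
    """Recompute C_{m-1} from M_1..M_{m-1}, then M_m = E^{-1}(sigma) xor C_{m-1}."""
    c = bytes(BLOCK)
    for m in leading_blocks:
        c = E(key, xor(c, m))
    return xor(E_inv(key, sigma), c)


blocks = [b"block-01", b"block-02", b"block-03"]
sigma = cbc_mac(b"demo-key", blocks)
recovered = recover_last(b"demo-key", blocks[:-1], sigma)
print(recovered == blocks[-1])
```

If the MAC were truncated to a proper prefix of C_m, `E_inv` could not be applied to it, which is why the full MAC is required.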

The computation of σ in the algorithm of Section 2 builds on the idea
described above. We extend the CBC-MAC variant analyzed in to domain {0,1}^{≥n},
doing this in a way that retains parsimoniousness (padding the second-to-last
block instead of the last one). This CBC-MAC variant is once again secure. Let
G : 𝒦^2 × {0,1}^{≥n} → {0,1}^n be the PRF obtained from the block cipher E by the
method illustrated in Lines 1-5 in the description of Algorithm F_{K1 || K2 || K3}(M)
in Section 2 (where we had E = DES). Then the results of jSJ can be adapted
to establish that
Adv^prf_G(t, q, μ) ≤ 2 · Adv^prp_E(t', q') + (μ/n + q)^2 2^{−n} + 2^{−n}
where t' ≈ t and q' = μ/n + q.



Parsimonious encryption. A parsimonious encryption scheme is a triple of
algorithms 𝒮 = (𝒦, ℰ, 𝒟). Algorithm 𝒦 returns a random element from the key
space (which we likewise denote 𝒦). Encryption algorithm ℰ : 𝒦 × ℳ takes a
key K ∈ 𝒦 and M ∈ ℳ, chooses a random IV ← {0,1}^n, and then encrypts
the message M into a ciphertext C = IV || C*, where |C*| = |M|. The process
is denoted C ← ℰ_K(M), or C ← ℰ_K(M; IV) when we regard IV as an explicitly
given input to ℰ. The decryption algorithm 𝒟 has domain 𝒦 × {0,1}^* and, given
K ∈ 𝒦 and C ∈ {0,1}^*, 𝒟_K(C) = M whenever C = ℰ_K(M; IV) for some
M ∈ ℳ and IV ∈ {0,1}^n.

We define security following The idea is that an adversary cannot
distinguish the encryption of text from the encryption of an equal-length string of
garbage. Let 𝒮 = (𝒦, ℰ, 𝒟) be a parsimonious encryption scheme and let A be a
distinguisher. Then

    Adv^priv_𝒮(A) = Pr[K ← 𝒦 : A^{ℰ_K(·)} = 1] − Pr[K ← 𝒦 : A^{ℰ_K($^{|·|})} = 1] .

In the first experiment the oracle, given M, returns a random encryption of M
under K, while in the second experiment it returns a random encryption of a
random string of length |M|. Define Adv^priv_𝒮(t, q, μ) to be max_A {Adv^priv_𝒮(A)}
where the maximum is over all adversaries who run in time at most t and ask
at most q oracle queries, these totaling at most μ bits.



Examples. Common methods of symmetric encryption using a block cipher are
parsimonious. For example, CBC-mode encryption with a random IV is
parsimonious. Its domain is ℳ = ({0,1}^n)^+, where n is the blocklength of the
underlying block cipher. The domain for CBC-mode encryption is easily
enlarged to ℳ = {0,1}^*; for example, if the last “block” M_m of plaintext has length


On the Construction of Variable-Input-Length Ciphers 239 



[Figure 2: the message M is split into M_prefix and M_suffix. A parsimonious PRF G,
keyed by K_prf, computes σ; a parsimonious encryption scheme 𝒮, keyed by K_enc,
encrypts M_prefix using σ = IV.]

Fig. 2. A general description of VIL mode. The ciphertext is σ || C_prefix. The value
σ is the output of the PRF G, the IV to the encryption scheme, and the first n bits of
ciphertext.



less than the blocklength n then encrypt it as C_m = E_K(C_{m−1})[1..|M_m|] ⊕ M_m.
Alternatively, counter-mode encryption (with a random initial counter) is
parsimonious and has domain {0,1}^*. This was the choice for Stage 2 in our example
scheme of Section 2. The security of CBC-mode and counter-mode encryption
are established in |^.
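
As an illustration of the second example, here is a minimal sketch of counter-mode encryption in the parsimonious format C = IV || C* with |C*| = |M|. The block cipher `E` is a hypothetical SHA-256-based Feistel permutation (not DES), messages are whole bytes, counter addition is assumed to be modulo 2^64, and the function names are made up for this sketch.

```python
import hashlib
import os

BLOCK = 8  # bytes; the blocklength n expressed in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E(key: bytes, block: bytes) -> bytes:
    """Hypothetical block cipher: a 4-round Feistel permutation (NOT DES)."""
    l, r = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + r).digest()[:4]
        l, r = r, xor(l, f)
    return l + r


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """First `length` bytes of E(IV) || E(IV+1) || ... (counter mod 2^64)."""
    ctr, out = int.from_bytes(iv, "big"), b""
    while len(out) < length:
        out += E(key, (ctr % 2**64).to_bytes(BLOCK, "big"))
        ctr += 1
    return out[:length]


def ctr_encrypt(key: bytes, msg: bytes, iv: bytes = None) -> bytes:
    """Return IV || C*; the IV is random unless given explicitly."""
    if iv is None:
        iv = os.urandom(BLOCK)
    return iv + xor(keystream(key, iv, len(msg)), msg)


def ctr_decrypt(key: bytes, ct: bytes) -> bytes:
    iv, body = ct[:BLOCK], ct[BLOCK:]
    return xor(keystream(key, iv, len(body)), body)


ct = ctr_encrypt(b"demo-key", b"hello")
print(len(ct) == BLOCK + 5, ctr_decrypt(b"demo-key", ct) == b"hello")
```

Passing the IV explicitly corresponds to the notation ℰ_K(M; IV); VIL mode uses exactly this entry point with the PRF output as the IV.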



VIL: General scheme. We are now ready to give the general description of
VIL mode. Let ℳ' be a message space, let n ≥ 1 be a number, and let ℳ =
ℳ' × {0,1}^n (strings n bits longer than strings in ℳ'). Let G : 𝒦_prf × ℳ → {0,1}^n
be a parsimonious PRF, and let Recover : 𝒦_prf × ℳ' × {0,1}^n → {0,1}^n be its
associated recovery algorithm. Let 𝒮 = (𝒦_enc, ℰ, 𝒟) be a parsimonious encryption
scheme in which ℰ : 𝒦_enc × ℳ' → ℳ. Then we construct the cipher F =
VIL[G, 𝒮], where F : 𝒦 × ℳ → ℳ, by setting 𝒦 = 𝒦_prf × 𝒦_enc and defining:

Algorithm F_{K_prf || K_enc}(M)
    M_prefix = M[1..|M| − n]
    σ = G_{K_prf}(M)
    C_prefix = ℰ_{K_enc}(M_prefix; σ)
    return C = σ || C_prefix

Algorithm F^{-1}_{K_prf || K_enc}(C)
    Let σ be the first n bits of C
    M_prefix = 𝒟_{K_enc}(C)
    M_suffix = Recover_{K_prf}(M_prefix, σ)
    return M = M_prefix || M_suffix

For a picture of the general scheme, see Fig. 2.



4 Analysis 

The following theorem says that F as constructed above is a secure
variable-input-length cipher, as long as both G and 𝒮 are secure.

Theorem 1. Let F = VIL[G, 𝒮] be the cipher obtained from the parsimonious
PRF G : 𝒦_prf × ℳ → {0,1}^n and the parsimonious encryption scheme 𝒮 =
(𝒦, ℰ, 𝒟). Then

    Adv^prp_F(t, q, μ) ≤ Adv^prf_G(t', q, μ) + Adv^priv_𝒮(t', q, μ) + q^2/2^n ,

where t' = t + O(qn + μ).

Proof. Let A be an adversary attacking F, and let t be its running time, q the
number of queries it makes, and μ the total length of all its queries put together.
We assume without loss of generality that A never repeats an oracle query. This
is important to some of the claims made below. We consider various probabilities
related to running A under various different experiments:

    p1 = Pr[K ← 𝒦 : A^{F_K(·)} = 1]

    p2 = Pr[K_enc ← 𝒦_enc ; g ← Rand(ℳ, n) : A^{ℰ_{K_enc}((·)_prefix; g(·))} = 1]

    p3 = Pr[K_enc ← 𝒦_enc : A^{ℰ_{K_enc}((·)_prefix)} = 1]

    p4 = Pr[K_enc ← 𝒦_enc : A^{ℰ_{K_enc}($^{|(·)_prefix|})} = 1]

    p5 = Pr[π ← Perm(ℳ) : A^{π(·)} = 1]

Let us explain the new notation. In the experiment defining p2, A’s oracle,
on query M, responds by encrypting the first |M| − n bits of M using coins
IV = g(M). In the experiment defining p3, A’s oracle, on query M, responds by
randomly encrypting the first |M| − n bits of M. In the experiment defining p4,
A’s oracle, on query M, responds by randomly encrypting a string of |M| − n
random bits.

Our goal is to upper bound Adv^prp_F(A) = p1 − p5. We do this in steps.
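Claim. $p_1 - p_2 \le \mathrm{Adv}^{prf}_G(t', q, \mu)$.

Proof. Consider the following distinguisher $D$ for $G$. It has an oracle for
$g : \mathcal{M} \to \{0,1\}^n$. It picks $K_{enc} \leftarrow \mathcal{K}_{enc}$. It runs $A$, and when $A$ makes oracle
query $M$ it returns $\mathcal{E}_{K_{enc}}(M_{prefix}; g(M))$ to $A$ as the answer (where $M_{prefix}$ is the
first $|M| - n$ bits of $M$). Finally $D$ outputs whatever $A$ outputs. Then

$\Pr[K_{prf} \leftarrow \mathcal{K}_{prf} : D^{G_{K_{prf}}(\cdot)} = 1] = p_1$

$\Pr[g \leftarrow \mathrm{Rand}(\mathcal{M}, n) : D^{g(\cdot)} = 1] = p_2$

So $\mathrm{Adv}^{prf}_G(D) = p_1 - p_2$. The claim follows.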

Claim. $p_2 = p_3$.

Proof. The only difference between the experiment underlying $p_2$ and that
underlying $p_3$ is that in the former, the IV used for encryption is a random function
of $M$, while in the latter it is chosen at random by the encryption algorithm.
These are the same as long as all the oracle queries are different, which is what
we assumed about $A$.
Claim. $p_3 - p_4 \le \mathrm{Adv}^{priv}_{\mathcal{S}}(t', q, \mu)$.

Proof. Consider the following adversary $B$ for $\mathcal{S}$ that is given an oracle $O$. It
runs $A$, and when $A$ makes oracle query $M$ it returns $O(M_{prefix})$ to $A$ as the
answer (where $M_{prefix}$ is the first $|M| - n$ bits of $M$). Finally $B$ outputs whatever
$A$ outputs. Then

$\Pr[K_{enc} \leftarrow \mathcal{K}_{enc} : B^{\mathcal{E}_{K_{enc}}(\cdot)} = 1] = p_3$

$\Pr[K_{enc} \leftarrow \mathcal{K}_{enc} : B^{\mathcal{E}_{K_{enc}}(\$^{|\cdot|})} = 1] = p_4$.

So $\mathrm{Adv}^{priv}_{\mathcal{S}}(B) = p_3 - p_4$. The claim follows.

Claim. $p_4 - p_5 \le q^2/2^n$.

Proof. Let $r = \Pr[\rho \leftarrow \mathrm{Rand}(\mathcal{M}) : A^{\rho(\cdot)} = 1]$. We argue that
$p_4 - r \le q^2/2^{n+1}$ and also $r - p_5 \le q^2/2^{n+1}$. The claim follows by the
triangle inequality. It remains to prove the two subclaims.

The second subclaim, that $r - p_5 \le q^2/2^{n+1}$, is of course clear; the statistical
distance between a family of functions and a family of permutations is given by
the collision probability under $q$ queries. So consider the first subclaim, namely
$p_4 - r \le q^2/2^{n+1}$. This is true because the encryption scheme is parsimonious.

The IV is chosen at random, and for each fixed IV, the map $\mathcal{E}_{K_{enc}}((\cdot)_{prefix}; IV)$
is a permutation on $\mathcal{M}$. Thus, $p_4 - r$ is the statistical distance between a family
of permutations on $\mathcal{M}$ and a family of random functions on $\mathcal{M}$, which is again
at most $q^2/2^{n+1}$ because all strings in $\mathcal{M}$ have length at least $n$.

Given these claims, we can complete the proof of the theorem by noting that
$\mathrm{Adv}^{prp}_F(A) = p_1 - p_5 = (p_1 - p_2) + (p_2 - p_3) + (p_3 - p_4) + (p_4 - p_5)$. ∎
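To get a feel for the additive birthday term in Theorem 1, the bound can be evaluated with exact rational arithmetic. The helper below is a hypothetical convenience function, not part of the paper; the two advantage terms are simply passed in as assumed numbers.

```python
from fractions import Fraction

def vil_prp_bound(adv_prf, adv_priv, q, n):
    """Right-hand side of Theorem 1: Adv_prf + Adv_priv + q^2 / 2^n."""
    return Fraction(adv_prf) + Fraction(adv_priv) + Fraction(q * q, 2 ** n)

if __name__ == "__main__":
    # With n = 128 and q = 2^32 queries the birthday term alone is 2^-64.
    print(vil_prp_bound(0, 0, 2 ** 32, 128))
```

With a 64-bit block the same number of queries already makes the term equal to 1, which is why the bound is only meaningful while $q$ stays well below $2^{n/2}$.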

5 Comments and Open Problems 

Our security bound for VIL mode enciphering degrades with $q^2$, as do the bounds
for other common modes of operation. It would be interesting to find a method
and analysis which had better quantitative security.

It would be desirable to have a good construction for a super variable-input-length
cipher (again, starting with a block cipher). Following [10], a super
pseudorandom cipher $F$ is one for which no reasonable adversary can do well
at distinguishing a pair of oracles $(F_K(\cdot), F_K^{-1}(\cdot))$, for a random $K \in \mathcal{K}$, from
a pair of oracles $(\pi(\cdot), \pi^{-1}(\cdot))$, for a random permutation $\pi(\cdot)$. This question
has been investigated by Bleichenbacher and Desai, who point out that our VIL
construction is not a super variable-input-length cipher, and they propose a
construction for such a cipher [5].

We have focussed on the case in which the message length is at least the
blocklength $n$ of the underlying block cipher. For shorter messages of even length
$2\ell$ one can proceed as follows. First map the underlying enciphering key $K$ into
subkeys $(K_{enc}, K_{prf}, K_1, K_2, \ldots)$ using standard key-separation techniques.
Now when $|M| \ge n$, proceed according to VIL mode, using keys $K_{enc}$ and $K_{prf}$.
But when $|M| < n$, encipher $M$ using an $r$-round Feistel network, keying the
block-cipher-derived round function by the corresponding subkey. We point out
that while such an approach may work well in practice, the bounds one gets
following [10] and its follow-on work will be very weak for our purposes, since
these bounds degrade as the blocklength shrinks and we are here imagining a
blocklength of just a few bits. Thus enciphering very short messages in a
provably-good way remains open.

When this paper was presented at FSE'99, Mike Matyas described an alternative
construction to encipher a message $M$: first, CBC-encrypt $M$ (with
zero IV) to get a ciphertext $N$; and then, to generate the ciphertext $C$, CBC-encrypt
$N$, but starting from its last block and working back towards the first
block. A similar scheme is given in [8]. Ciphertext stealing can be used to handle
inputs of length not a multiple of the blocklength. This sort of “forward-then-backwards”
CBC sounds like an elegant approach, and it would be interesting
to know if some version of it can be proven secure.



Acknowledgments 

Many thanks to the anonymous reviewers of FSE'99, whose comments significantly
improved our presentation. And thanks to Stefan Lucks and Ron Rivest
for their comments on an earlier version of this work.

Mihir Bellare was supported by NSF CAREER Award CCR-9624439 and
a Packard Foundation Fellowship in Science and Engineering. Phillip Rogaway
was supported by NSF CAREER Award CCR-962540, and MICRO grants 97-150
and 98-129, funded by RSA Data Security, Inc., and ORINCON Corporation.
Much of Phil's work on this paper was carried out while on sabbatical at
Chiang Mai University, Thailand, hosted by the Computer Service Center, under
Prof. Krisorn Jittorntrum and Prof. Darunee Smawatakul.



References 

1. R. Anderson and E. Biham, “Two practical and provably secure block ciphers:
BEAR and LION.” Proceedings of the 3rd Fast Software Encryption Workshop,
Lecture Notes in Computer Science Vol. 1039, Springer-Verlag, 1996.

2. M. Bellare and P. Rogaway, “Block cipher mode of operation for secure,
length-preserving encryption.” US Patent #5,673,319, September 30, 1997. Filed
February 6, 1995.

3. M. Bellare, A. Desai, E. Jokipii and P. Rogaway, “A concrete security
treatment of symmetric encryption.” Proceedings of the 38th Symposium on
Foundations of Computer Science, IEEE, 1997.

4. M. Bellare, J. Kilian and P. Rogaway, “The security of cipher block
chaining.” Advances in Cryptology - Crypto 94 Proceedings, Lecture Notes in
Computer Science Vol. 839, Y. Desmedt ed., Springer-Verlag, 1994.

5. D. Bleichenbacher and A. Desai, “A construction of a super-pseudorandom
cipher.” Manuscript, February 1999.






6. O. Goldreich, S. Goldwasser and S. Micali, “How to construct random
functions.” Journal of the ACM, Vol. 33, No. 4, 210-217, 1986.

7. S. Goldwasser and S. Micali, “Probabilistic encryption.” Journal of Computer 
and System Sciences Vol. 28, 270-299, April 1984. 

8. R. Housley, “Cryptographic message syntax.” S/MIME Working Group of the
IETF, Internet Draft draft-ietf-smime-cms-12.txt. March 1999.

9. ISO/IEC 9797, “Information technology - Security techniques - Data integrity
mechanism using a cryptographic check function employing a block cipher
algorithm,” International Organization for Standardization, Geneva, Switzerland, 1994
(second edition).

10. M. Luby and C. Rackoff, “How to construct pseudorandom permutations from 
pseudorandom functions.” SIAM J. Computing, Vol. 17, No. 2, April 1988. 

11. S. Lucks, “Faster Luby-Rackoff ciphers.” Proceedings of the 3rd Fast Software 
Encryption Workshop, Lecture Notes in Computer Science Vol. 1039, Springer- 
Verlag, 1996. 

12. U. Maurer, “A simplified and generalized treatment of Luby-Rackoff pseudoran- 
dom permutation generators.” Advances in Cryptology - Eurocrypt 92 Procee- 
dings, Lecture Notes in Computer Science Vol. 658, R. Rueppel ed., Springer- 
Verlag, 1992, pp. 239-255. 

13. C. Meyer and M. Matyas, Cryptography: A New Dimension in Data Security. 
John Wiley & Sons, New York, 1982. 

14. S. Micali, C. Rackoff and R. Sloan, “The notion of security for probabilistic 
cryptosystems.” SIAM J. Computing, Vol. 17, No. 2, April 1988. 

15. M. Naor and O. Reingold, “On the construction of pseudorandom permuta- 
tions: Luby-Rackoff revisited.” Proceedings of the 29th Annual Symposium on 
Theory of Computing, ACM, 1997. 

16. National Bureau of Standards, FIPS PUB 46, “Data encryption standard.” U.S. 
Department of Commerce, January 1977. 

17. National Bureau of Standards, FIPS PUB 81, “DES modes of operation.” U.S. 
Department of Commerce, December 1980. 

18. S. Patel, Z. Ramzan and G. Sundaram, “Towards making Luby-Rackoff
ciphers optimal and practical.” Proceedings of the 6th Fast Software Encryption
Workshop, 1999.

19. J. Patarin, “Improved security bounds for pseudorandom permutations.”
Proceedings of the Fourth Annual Conference on Computer and Communications
Security, ACM, 1997.

20. J. Patarin, “About Feistel schemes with six (or more) rounds.” Proceedings of
the 5th Fast Software Encryption Workshop, Lecture Notes in Computer Science
Vol. 1372, Springer-Verlag, 1998.

21. E. Petrank and C. Rackoff, “CBC MAC for real-time data sources.”
Manuscript, available at http://philby.ucsd.edu/cryptolib.html, 1997.

22. J. Pieprzyk, “How to construct pseudorandom permutations from single
pseudorandom functions.” Advances in Cryptology - Eurocrypt 90 Proceedings,
Lecture Notes in Computer Science Vol. 473, I. Damgard ed., Springer-Verlag, 1990,
pp. 140-150.

23. R. Rivest, “All-or-nothing encryption and the package transform.” Proceedings of
the 4th Fast Software Encryption Workshop, Lecture Notes in Computer Science
Vol. 1267, Springer-Verlag, 1997.







24. D. Wagner, B. Schneier and J. Kelsey, “Cryptanalysis of the cellular message
encryption algorithm.” Advances in Cryptology - Crypto 97 Proceedings,
Lecture Notes in Computer Science Vol. 1294, B. Kaliski ed., Springer-Verlag, 1997,
pp. 526-537.

25. Y. Zheng, T. Matsumoto and H. Imai, “Impossibility results and optimality
results on constructing pseudorandom permutations.” Advances in Cryptology -
Eurocrypt 89 Proceedings, Lecture Notes in Computer Science Vol. 434,
J-J. Quisquater, J. Vandewalle ed., Springer-Verlag, 1989.




Slide Attacks 



Alex Biryukov* and David Wagner** 



Abstract. It is a general belief among the designers of block-ciphers 
that even a relatively weak cipher may become very strong if its number 
of rounds is made very large. In this paper we describe a new gene- 
ric known- (or sometimes chosen-) plaintext attack on product ciphers, 
which we call the slide attack and which in many cases is independent 
of the number of rounds of a cipher. We illustrate the power of this 
new tool by giving practical attacks on several recently designed ciphers: 
TREYFER, WAKE-ROFB, and variants of DES and Blowfish. 



1 Introduction 

As the speed of computers grows, fast block ciphers tend to use more and more 
rounds, rendering all currently known cryptanalytic techniques useless. This is 
mainly due to the fact that such popular tools as differential and linear
analysis [?] are statistical attacks that excel in pushing statistical irregularities and
biases through surprisingly many rounds of a cipher. However any such approach 
finally reaches its limits, since each additional round requires an exponential ef- 
fort from the attacker. 

This tendency towards a higher number of rounds can be illustrated if one 
looks at the candidates submitted to the AES contest. Even though one of the 
main criteria of the AES was speed, several prospective candidates (and not 
the slowest ones) have really large numbers of rounds: RC6(20), MARS(32), 
SERPENT(32), CAST(48). This tendency is a reflection of a belief/empirical 
evidence that after some high number of rounds even a relatively weak cipher 
becomes very strong. It is supported by the example of DES, where breaking 
even 16 rounds is already a very hard task, to say nothing about 32-48 rounds 
(e.g. double- or triple-DES). Thus for the cryptanalyst it becomes natural to 
search for new tools which are essentially independent of the number of rounds 
of a cipher. The first step in this direction can be dated back to a 1978 paper by 
Grossman and Tuckerman [?], which has shown how to break a weakened Feistel
cipher¹ by a chosen plaintext attack, independent of the number of rounds.
We were also inspired by Biham’s work on related-key cryptanalysis [2], and
Knudsen’s early work [?].

* Applied Mathematics Department, Technion - Israel Institute of Technology, Haifa,
Israel 32000. Email: albi@cs.technion.ac.il

** University of California, Berkeley. Email: daw@cs.berkeley.edu

¹ An 8-round Feistel cipher with eight bits of key material per round used to swap
between two S-boxes $S_0$ and $S_1$ in a Lucifer-like manner. A really weak cipher by
modern criteria.
In this paper we introduce a new class of generic attacks which we call slide
attacks together with a new set of cryptanalytic tools applicable to all product 
(mainly iterative) ciphers and even to any iterative (or recursive) process over the 
finite domain (stream ciphers, etc.). Such attacks apply as soon as the iterative 
process exhibits some degree of self-similarity and are in many cases independent 
of the exact properties of the iterated round function and of the number of 
rounds. 

While the two other generic cryptanalytic attacks — differential and linear 
analysis — concentrate mainly on the propagation properties of the encryption 
engine (assuming a strong key-scheduling which produces independent subkeys), 
the degree of self-similarity of a cipher as studied by slide attacks is a totally diffe- 
rent aspect. Depending on the cipher’s design, slide attacks range from exploiting 
key-scheduling weaknesses to exploiting more general structural properties of a 
cipher. The most obvious version of this attack is usually easy to prevent by 
destroying the self-similarity of an iterative process, for example by adding ite- 
ration counters or fixed random constants. However more sophisticated variants 
of this technique are harder to analyze and to defend against. 

We start by analyzing several block ciphers that decompose into $r$ iterations
of a single key-dependent permutation $F_i$. We call such ciphers homogeneous.
This usually arises when the key-schedule produces a periodic subkey sequence,
when $F_i = F_j$ for all $i \equiv j \bmod p$ where $p$ represents the period. In the simplest
case, $p = 1$ and all round subkeys are the same. We call these attacks self-related
key attacks, since they are essentially a special case of related-key attacks [2].
Note, however, that our attacks require only a known- (or sometimes chosen-)
plaintext assumption and thus are much more practical than most related key
attacks². For the case of block ciphers operating on an $n$-bit block, the complexity
of slide attacks (if they work) is usually close to $O(2^{n/2})$ known plaintexts. For
Feistel ciphers where the round function $F_j$ modifies only half of the block, there
is also a chosen-plaintext variant which can often cut the complexity down to
$O(2^{n/4})$ chosen texts.

A somewhat less expected observation is that schemes relying on key-depen- 
dent S-boxes are also vulnerable to sliding. In general, autokey ciphers and data- 
dependent transformations are potentially vulnerable to such attacks. We
summarize our results in Table 1.

This paper is organized as follows. In Section 2 we describe the details of a
typical slide attack, and in Section 3 we show how the attacks can be optimized
for Feistel ciphers. We then proceed with an introductory example: a 96-bit DES
variant with 64 rounds, which we call 2K-DES (Section 4). The next two sections
are devoted to cryptanalysis of several concrete cipher proposals: Section 5 breaks
TREYFER, a cipher published in FSE’97, and Section 6 analyzes stream cipher
proposals based on WAKE presented at FSE’97 and FSE’98. Finally, Section 7
shows slide attacks on ciphers with key-dependent S-boxes, focusing on a variant
of Blowfish with zero round subkeys.



² However, Knudsen’s early work on LOKI91 [?] showed how to use a related-key-type
weakness to reduce the cost of exhaustive keysearch using only chosen plaintexts.
Table 1. Summary of our attacks on various ciphers.

                                     Our Attack
Cipher      (Rounds)  Key Bits   Data Complexity   Time Complexity
Blowfish†   (16)      448        $2^{27}$ CP       $2^{27}$
Treyfer     (32)      64         $2^{32}$ KP       $2^{44}$
2K-DES      (64)      96         $2^{33}$ ACP      $2^{33}$
2K-DES      (64)      96         $2^{32}$ KP       $2^{50}$
WAKE-ROFB   (k)       32n        $2^{32}$ CR       $2^{32}$

† Modified variant, without round subkeys. KP: known-plaintext, CP: chosen-plaintext,
ACP: adaptive chosen-plaintext, CR: chosen-resynchronization (IV).



2 A Typical Slide Attack 

In Figure 1 we show the process of encrypting the $n$-bit plaintext $X_0$ under a
typical product cipher to obtain the ciphertext $X_r$. Here $X_j$ denotes the
intermediate value of the block after $j$ rounds of encryption, so that $X_j = F_j(X_{j-1}, k_j)$.
For the sake of clarity, we often omit $k$ by writing $F(x)$ or $F_i(x)$ instead of
$F(x, k)$ or $F_i(x, k)$.




Fig. 1. A typical block cipher 



As we mentioned before, the attack presented in this note is independent
of the number of rounds of the cipher, since it views a cipher as a product
of identical permutations $F(x, k)$, where $k$ is a fixed secret key (here $F$ might
include more than one round of the cipher). Moreover its dependence on the
particular structure of $F$ is marginal. The only requirement on $F$ is that it is
very weak against known-plaintext attack with two plaintext-ciphertext pairs.
More specifically, we call $F$ a weak permutation if given the two equations
$F(x_1, k) = y_1$ and $F(x_2, k) = y_2$ it is “easy” to extract the key $k$. This is
an informal definition since the amount of easiness may vary from cipher to cipher.
We can show that 3 rounds of DES form a weak permutation³. One and a half
rounds of IDEA are also weak.

³ For $F$ = three rounds of DES with the DES key schedule, one may consider the 4-bit
output of a specific S-box at the 1st and 3rd rounds. This gives a 4-bit condition on the 6
key bits entering this S-box at the 1st round and on the 6 bits entering this S-box at the 3rd
round. Using similar observations it is possible to extract the full DES 56-bit key in
time faster than that of one DES encryption.
We next show in Figure 2 how a slide attack against such a cipher might
proceed. The idea is to “slide” one copy of the encryption process against another
copy of the encryption process, so that the two processes are one round out of
phase. We let $X_0$ and $X'_0$ denote the two plaintexts, with $X_j = F_j(X_{j-1})$ and
$X'_j = F_j(X'_{j-1})$. With this notation, we line up $X_1$ next to $X'_0$, and $X_{j+1}$ next
to $X'_j$.




Fig. 2. A typical slide attack 



Next, we suppose that $F_j = F_{j+1}$ for all $j \ge 1$; this is the assumption required
to make the slide attack work. In this case, all the round functions are the same,
so for the remainder of this section we will drop the subscripts and simply write
$F$ for the generic round transform.

The crucial observation is that if we have a match $X_1 = X'_0$, then we will
also have $X_r = X'_{r-1}$. The proof is by induction. Suppose that $X_j = X'_{j-1}$. Then
we may compute $X_{j+1} = F(X_j) = F(X'_{j-1}) = X'_j$, which completes
the proof. Therefore, we call a pair $(P, C)$, $(P', C')$ of known plaintexts (with
corresponding ciphertexts) a slid pair if $F(P) = P'$ and $F(C) = C'$.

With this observation in hand, the attack proceeds as follows. We obtain
$2^{n/2}$ known texts $(P_i, C_i)$, and we look for slid pairs. By the birthday paradox,
we expect to find about one pair of indices $i, i'$ where $F(P_i) = P_{i'}$, which gives
us a slid pair.

Furthermore, slid pairs can often be recognized relatively easily. In general, 
we recognize slid pairs by checking whether it is possible that $F(P_i) = P_{i'}$ and
$F(C_i) = C_{i'}$ both hold for some key. When the round function is weak, we are
assured that this condition will be easy to recognize. Once we have found a slid 
pair, we expect to be able to recover some key bits of the cipher. If the round 
function is weak, we can in fact recover the entire key with not too much work. 
In general, we expect a single slid pair to disclose about n bits of key material; 
when the cipher’s key length is longer than n bits, we may use exhaustive search 
to recover the remainder of the key, or we may alternatively obtain a few more 
slid pairs and use them to learn the rest of the key material. 

Let us summarize the attack. For a cipher with $n$-bit blocks and repeating
round subkeys, we need about $O(2^{n/2})$ known plaintexts to recover the unknown
key. While a naive approach will require $O(2^n)$ work, much faster attacks are
usually possible by exploiting the weaknesses in the $F$ function. This technique
applies to a very wide class of round functions.
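The attack summarized above can be tried on a toy cipher. The sketch below is not one of the ciphers analyzed in this paper: it assumes a hypothetical 16-bit block cipher that iterates the weak round F(x, k) = rotl(x + k, 3) with the same 16-bit key in every round, so that a single input/output pair of F reveals k. Slid pairs are found by hashing instead of testing all pairs, and the demo uses more than $2^{n/2}$ known texts so that a slid pair is practically certain to be present.

```python
import random

N_BITS = 16                      # toy block size n (assumption)
MASK = (1 << N_BITS) - 1
ROUNDS = 64                      # the attack never looks inside the rounds

def rotl(x, s):
    return ((x << s) | (x >> (N_BITS - s))) & MASK

def rotr(x, s):
    return ((x >> s) | (x << (N_BITS - s))) & MASK

def F(x, k):
    """Toy weak round function (hypothetical)."""
    return rotl((x + k) & MASK, 3)

def encrypt(x, k, rounds=ROUNDS):
    for _ in range(rounds):
        x = F(x, k)
    return x

def key_from_pair(x, y):
    """Solve F(x, k) = y for k."""
    return (rotr(y, 3) - x) & MASK

def slide_attack(texts):
    # F(P) = P' and F(C) = C' suggest the same key exactly when
    # rotr(P') - P == rotr(C') - C, i.e. P - C == rotr(P') - rotr(C').
    by_diff = {}
    for p, c in texts:
        by_diff.setdefault((p - c) & MASK, []).append((p, c))
    for p2, c2 in texts:
        tag = (rotr(p2, 3) - rotr(c2, 3)) & MASK
        for p, c in by_diff.get(tag, ()):
            k = key_from_pair(p, p2)
            # Discard false alarms by trial encryption of a few known texts.
            if all(encrypt(x, k) == y for x, y in texts[:4]):
                return k
    return None

def demo(seed=1, n_texts=1 << 12):
    rng = random.Random(seed)
    key = rng.randrange(1 << N_BITS)
    plaintexts = rng.sample(range(1 << N_BITS), n_texts)
    texts = [(p, encrypt(p, key)) for p in plaintexts]
    return key, slide_attack(texts)
```

Changing `ROUNDS` to any other value leaves `slide_attack` untouched apart from the final trial encryptions, which is the point of the technique.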



3 Slide Attacks on Feistel Ciphers 

In this section, we show how the slide attack can be optimized when it is applied 
to a Feistel cipher. 

Known-plaintext attacks. In the case of Feistel ciphers, the round function
$F((l, r)) = (r \oplus f(l), l)$ modifies only half of its input. Therefore, the
condition $F(x) = x'$ can be recognized by simply comparing the left half of $x$
against the right half of $x'$, and this filtering condition eliminates all but $2^{-n/2}$
of the wrong pairs.

This improved filtering allows us to reduce the time complexity of attack
under the known-plaintext threat model to $2^{n/2}$ known texts and $2^{n/2}$ offline
work. We have an $n$-bit filtering condition on the potential slid pairs, for if $(P_i, C_i)$
forms a slid pair with $(P'_j, C'_j)$ then we have $F(P_i) = P'_j$ and $F(C_i) = C'_j$.
Therefore, potential slid pairs can be identified using a lookup table (or sorted
list) with $2^{n/2}$ entries: we sort the known texts $(P_i, C_i)$ based on the left halves
of $P_i$ and $C_i$, and for each $j$ we look for a match with the right halves of $P'_j$ and
$C'_j$.

With this technique, we expect to find one good slid pair along with only 
one false alarm; false matches can be easily eliminated in a second phase. The 
slid pair gives us about n bits of information on the key; if this does not reveal 
all of the key material, we can look for a few more slid pairs or search over the 
remaining unknown key bits. 
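The lookup-table filtering for Feistel ciphers can be sketched as follows. Everything concrete here is an assumption made for the demonstration: a toy 24-bit Feistel cipher with the same 12-bit subkey in every round, and a round function f(x) = SBOX[x xor k] built from a fixed public permutation so that one input/output pair of f reveals k. The demo uses $2^{14}$ texts rather than $2^{n/2} = 2^{12}$ so that a slid pair is practically certain to exist.

```python
import random
from collections import defaultdict

H = 12                       # half-block size in bits, so n = 24 (toy size)
ROUNDS = 16

_rng = random.Random(0)
SBOX = list(range(1 << H))
_rng.shuffle(SBOX)           # fixed public permutation (assumption)
INV = [0] * (1 << H)
for _i, _v in enumerate(SBOX):
    INV[_v] = _i

def f(x, k):
    return SBOX[x ^ k]

def F(block, k):
    l, r = block
    return (r ^ f(l, k), l)  # F((l, r)) = (r xor f(l), l)

def encrypt(block, k, rounds=ROUNDS):
    for _ in range(rounds):
        block = F(block, k)
    return block

def key_from_round(x, y):
    # F(x) = y implies f(x_left) = x_right xor y_left.
    return INV[x[1] ^ y[0]] ^ x[0]

def feistel_slide(texts):
    # Index texts by the left halves of (P, C); a slid partner (P', C')
    # must carry the same values in its right halves.
    table = defaultdict(list)
    for p, c in texts:
        table[(p[0], c[0])].append((p, c))
    for p2, c2 in texts:
        for p, c in table.get((p2[1], c2[1]), ()):
            k = key_from_round(p, p2)
            if k == key_from_round(c, c2) and all(
                    encrypt(x, k) == y for x, y in texts[:2]):
                return k
    return None

def demo(seed=2, n_texts=1 << 14):
    rng = random.Random(seed)
    key = rng.randrange(1 << H)
    texts = []
    for v in rng.sample(range(1 << (2 * H)), n_texts):
        p = (v >> H, v & ((1 << H) - 1))
        texts.append((p, encrypt(p, key)))
    return key, feistel_slide(texts)
```

The table lookup replaces the quadratic search over pairs; the few false alarms that survive the $n$-bit filter are removed by comparing the two suggested keys and by trial encryption.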

Chosen-plaintext attacks. For Feistel ciphers, the data complexity can
be reduced further to about $2^{n/4}$ texts when chosen plaintext queries are
available. The key to the reduction in texts is the use of carefully-chosen structures.
(This technique was first pioneered by Biham in his work on related-key
cryptanalysis [2].) Fix an arbitrary $n/2$-bit value $x$. We choose a pool of $2^{n/4}$ plaintexts
$P_i = (x, y_i)$ by varying over $2^{n/4}$ random values for $y_i$, and then choose a second
pool of $2^{n/4}$ texts of the form $P'_j = (y'_j, x)$ by varying over another $2^{n/4}$ random
choices for $y'_j$. This provides $2^{n/2}$ pairs of plaintexts, and a right pair occurs with
probability $2^{-n/2}$ (namely, when $f(x) = y_i \oplus y'_j$), so we expect to find about one
slid pair. This slid pair can be recognized using the $n/2$-bit filtering condition
on the ciphertexts, and then we can use it to recover $n$ bits of key material as
before.⁴
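A minimal sketch of the chosen-plaintext structures, again on an assumed toy cipher (24-bit Feistel, one 12-bit subkey repeated in every round, f(x) = SBOX[x xor k] with a fixed public permutation). The `oracle` argument stands for the encryption oracle; `pool_size` is a free parameter, and the demo values below take it larger than $2^{n/4}$ so that the expected number of slid pairs is well above one.

```python
import random

H = 12                                   # half-block bits, n = 24 (toy size)
ROUNDS = 16

_rng = random.Random(0)
SBOX = list(range(1 << H))
_rng.shuffle(SBOX)                       # fixed public permutation (assumption)
INV = [0] * (1 << H)
for _i, _v in enumerate(SBOX):
    INV[_v] = _i

def encrypt(block, k, rounds=ROUNDS):
    l, r = block
    for _ in range(rounds):
        l, r = r ^ SBOX[l ^ k], l        # F((l, r)) = (r xor f(l), l)
    return (l, r)

def chosen_plaintext_slide(oracle, pool_size, rng):
    x = rng.randrange(1 << H)            # fixed half shared by both pools
    # First pool: P_i = (x, y_i).
    first = [((x, y), oracle((x, y)))
             for y in rng.sample(range(1 << H), pool_size)]
    # Second pool: P'_j = (y'_j, x), indexed by the right half of C'_j.
    second = {}
    for y2 in rng.sample(range(1 << H), pool_size):
        c2 = oracle((y2, x))
        second.setdefault(c2[1], []).append((y2, c2))
    for (_, y), c in first:
        # n/2-bit filter: left half of C must equal right half of C'.
        for y2, c2 in second.get(c[0], ()):
            k = INV[y ^ y2] ^ x          # from f(x) = y_i xor y'_j
            if INV[c[1] ^ c2[0]] ^ c[0] == k and encrypt((x, y), k) == c:
                return k
    return None
```

With `pool_size = 256` the two pools give $2^{16}$ plaintext pairs, each a slid pair with probability $2^{-12}$, so roughly sixteen slid pairs are expected in this toy setting.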

Probable-plaintext attacks. When plaintexts contain some redundancy, 
the data complexity of a known-plaintext slide attack can often be significantly 
reduced. Our techniques are derived from Biryukov and Kushilevitz’s recent work
on exploiting such plaintext redundancy in differential attacks [?].

⁴ Notice that if we deal with an unbalanced Feistel cipher, the effect of a chosen
plaintext attack can be even greater. For example for a Skipjack-like construction
with the same keyed permutation in all rounds, a chosen plaintext attack with only
$2^{n/8}$ time and data is possible.
Consider first a very simple model: the plaintext source emits blocks where
the four most significant bits of each byte are always zero, so that the resulting
$n$-bit plaintexts have only $n/2$ bits of entropy. In this case, one can mount a
slide attack with about $2^{3n/8}$ ciphertexts, which is midway between the data
complexities of chosen-plaintext slide attacks ($2^{n/4}$ texts) and known-plaintext
slide attacks ($2^{n/2}$ texts, for uniformly-distributed plaintexts).

The observation is that for any fixed value $x$ that can occur as the left half
of a plaintext, we expect to see about $2^{3n/8 - n/4} = 2^{n/8}$ plaintexts of the form
$P_i = (x, y_i)$, along with another $2^{n/8}$ plaintexts of the form $P'_j = (y'_j, x)$. Each
$x$ gives about $2^{n/4}$ pairs of texts, and there are $2^{n/4}$ values for $x$. Assuming $f$
behaves randomly, any such pair gives a $2^{-n/2}$ chance of forming a slid pair, so
in total we expect to find about one slid pair among all the $2^{3n/8}$ ciphertexts.

This attack can even be converted to a ciphertext-only attack with a slight
increase in complexity. Suppose the condition $f(u) = v$, $f(u') = v'$ uniquely
identifies the key, and key recovery from $u, u', v, v'$ takes $O(1)$ time. Then we
can find the key with $2^{3n/8+1}$ ciphertexts and $O(2^{n/2})$ work. First, we note that
the $n/2$-bit filtering condition on the ciphertexts gives a set of $2^{n/4+2}$ potential
slid pairs, of which about four are correct (the rest are false matches). The list of
potential slid pairs can be identified with $O(2^{3n/8})$ steps by hashing or sorting.
Next, we make a guess at a correct slid pair $C_i, C'_{i'}$. Third, for each remaining
potential slid pair $C_j, C'_{j'}$, we compute the key value suggested by the equations
$F(C_i) = C'_{i'}$, $F(C_j) = C'_{j'}$, and store this $n$-bit key value in a table. We search for
collisions in this table (by hashing or sorting). If our guess at $C_i, C'_{i'}$ indeed gave
a correct slid pair, the right key value will be suggested three times. On the other
hand, the birthday paradox assures us that wrong key values will be suggested
only with negligible probability. This yields an attack that takes $O(2^{n/2})$ time
($2^{n/4}$ guesses at $C_i, C'_{i'}$, performing $O(2^{n/4})$ operations per guess to build the
table) and needs $2^{n/4+2}$ space and about $2^{3n/8+1}$ ciphertexts.

Of course, this is only an example. The exact complexity of the probable- 
plaintext and ciphertext-only slide attacks can vary widely: some plaintext distri- 
butions increase the complexity of slide attacks (or even render them impossible), 
while others reduce the complexity substantially. In general, the expected number
of texts needed to find the first slid pair is approximately
$2^{n/4} (\sum_x \Pr[r = x] \Pr[l = x])^{-1/2}$ (under heuristic assumptions on $f$), although the exact details
of the attack will depend intimately on the distribution of the plaintexts.
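The heuristic estimate can be checked numerically against the two data points given in the text. The helper below is a hypothetical illustration: it takes the distributions of the left and right plaintext halves as dictionaries and returns the base-2 logarithm of the estimated number of texts.

```python
import math

def log2_texts_needed(n, pr_left, pr_right):
    """log2 of 2^(n/4) * (sum_x Pr[r = x] * Pr[l = x])^(-1/2).

    pr_left and pr_right map half-block values to their probabilities
    as left and right plaintext halves (illustrative inputs).
    """
    overlap = sum(p * pr_right.get(x, 0.0) for x, p in pr_left.items())
    return n / 4 - 0.5 * math.log2(overlap)

if __name__ == "__main__":
    n = 16
    uniform = {x: 1 / 256 for x in range(256)}   # uniform 8-bit halves
    nibble = {x: 1 / 16 for x in range(16)}      # top four bits always zero
    print(log2_texts_needed(n, uniform, uniform))  # n/2
    print(log2_texts_needed(n, nibble, nibble))    # 3n/8
```

For uniformly distributed plaintexts the estimate reduces to $2^{n/2}$, and for the model in which the top four bits of every byte are zero it gives $2^{3n/8}$, matching the figures above.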

4 Modified DES Example: 2K-DES 

The following constitutes in our opinion a nice problem for a student crypto- 
course or an introductory crypto-textbook. Suppose one proposes to strengthen 
DES in the following way. One increases the number of rounds from 16 to 64, 
and extends the number of key-bits from 56 to 96 in the following simple way: 
given two independent 48-bit keys $K_1, K_2$ one uses $K_1$ in the odd rounds and $K_2$
in the even rounds instead of DES subkeys. This version is obviously immune
to exhaustive search. The conventional differential and linear attacks probably
will also fail due to the increased number of rounds. The question is: “Is this
cipher more secure than DES?” Below we show two attacks on this cipher which 
use the symmetry of the key-scheduling algorithm and are independent of the 
number of rounds. 

One very simple way to attack such a cipher is as follows. For any known
plaintext-ciphertext pair $(P, C)$, decrypt ciphertext $C$ one round under all
possible $2^{32}$ outputs from the $f$ function in the last round. For each of the $2^{32}$
resulting texts $C'$, request the encryption $P' = E_K(C')$. This is equivalent to
decryption all the way back to the plaintext $P$ and further by one more round to
$F^{-1}(P, K_2) = P'$. Since $F$ preserves 32 bits of its input, one can check a 32-bit
filtering condition over $P, P'$ to eliminate essentially all of the wrong guesses at
$C'$. When we find a $C', P'$ which survives the filtering condition, we can derive
$K_2$ easily from the equations $F(P', K_2) = P$, $F(C', K_2) = C$ (here $F$ includes
the Feistel swap of the halves). This procedure leaves only the correct value of
$K_2$ with high probability. Now $K_1$ can be found by exhaustive search; or, for a
more efficient attack, we can peel off the first round using the known value of $K_2$,
and repeat the attack once more on the resulting cipher to learn $K_1$. This simple
attack uses one known-plaintext $(P, C)$ pair, $2^{33}$ adaptive chosen plaintexts and
$2^{33}$ time. A similar attack will actually work for any “almost”-symmetric
key-scheduling; see also [?] for another example of this type of attack. Notice that
if the number of rounds $r$ is odd and key-scheduling is symmetric then double
encryption with such a Feistel-cipher becomes an identity permutation.

This attack can be improved using the ideas of the present paper. By applying
slide techniques, we can show that this cipher is much weaker than one would
expect even when its number of rounds $r$ is arbitrarily large, and that attacks are
available even under the more practical known-plaintext threat model. For any
fixed value of $K_1, K_2$ this cipher can be viewed as a cascade of $r/2$ identical fixed
permutations. Thus given a pool of $2^{32}$ known plaintexts, one can recover all 96
bits of the secret key just by checking all the possible pairs in about $2^{63}/64 = 2^{57}$
naive steps (each step is equivalent to one 2K-DES encryption operation). Each
pair of plaintexts $(P, P^*)$ suggests $2^{16}$ candidates for $K_1$ and $2^{16}$ candidates for
$K_2$ which are immediately checked against a pair of corresponding ciphertexts
$(C, C^*)$. Thus on the average after this process we are left with a few candidate
96-bit keys which can be further checked with trial encryption. Using a more
sophisticated approach (ruling out many pairs simultaneously) it is possible to
reduce the work factor considerably. For each plaintext we guess the left 24 bits
of $K_1$, which allows us to calculate 16 bits of the S-box output and thus 16 bits
of the possible related plaintext and 16 bits of the related ciphertext. This gives a
32-bit condition on the possible related plaintext/ciphertext pair; then analyzing
the pool of texts will take a total of $2^{32} \times 2^{24}/64 = 2^{50}$ steps.
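The two work factors quoted for the known-plaintext attack follow from simple counting, under the text's convention that one step equals one 2K-DES encryption of 64 rounds, so that a one-round check costs about 1/64 of a step:

```latex
\begin{align*}
\text{naive search: } & \binom{2^{32}}{2} \approx 2^{63} \text{ pairs},
  & \frac{2^{63}}{64} &= 2^{63-6} = 2^{57} \text{ steps},\\
\text{24-bit guesses: } & 2^{32} \text{ texts} \times 2^{24} \text{ guesses},
  & \frac{2^{32} \cdot 2^{24}}{64} &= 2^{56-6} = 2^{50} \text{ steps}.
\end{align*}
```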



5 TREYFER 

In this section we apply slide attacks to cryptanalyze TREYFER, a block- 
cipher/MAC presented at FSE’97 by Gideon Yuval and aimed at smart-card 
applications. It is characterized by a simple, extremely compact design (only 29
bytes of code) and a very large number of rounds (32). We show an attack on
TREYFER that is independent of the number of rounds and exploits the
simplicity of the key-schedule of this cipher. It uses $2^{32}$ known plaintexts and requires
$2^{44}$ time for analysis.



Description of TREYFER 

TREYFER is a 64-bit block cipher/MAC, with a 64-bit key, designed for very
constrained architectures (like an 8051 CPU with 1KB flash EPROM, 64 bytes
RAM, 128 bytes EPROM and peak 1MHz instruction rate). The algorithm is as
follows:

    for (r = 0; r < NumRounds; r++) {
        text[8] = text[0];
        for (i = 0; i < 8; i++)
            text[i+1] = (text[i+1] + Sbox[(key[i] + text[i]) % 256]) <<< 1;
                        // rotate 1 left
        text[0] = text[8];
    }



Here text is an eight-byte plaintext, key is an eight-byte key, Sbox denotes
an 8x8-bit S-box chosen at random, and NumRounds stands for 32 rounds. After
32 rounds of encryption text contains the eight-byte ciphertext. One of the
motivations behind the design of this cipher was that in spite of the simplicity of
the round function a huge number of rounds (32) will make any possible attack
impractical.

As an aside (without any connection to our attack), we observe that TREY- 
FER exhibits much weaker diffusion in the decryption direction: it takes two 
rounds for a one-byte difference to influence all eight bytes in the encryption 
direction, but it takes seven rounds in the decryption direction. 



Our Attack on TREYFER 

The idea of our attack is very similar to that of the related-key attacks;
however, our attack is a known-plaintext attack and not a chosen-key attack like
those.

In our attack we use the fact that due to hardware constraints the designers 
of TREYFER sacrificed a proper key-scheduling to make a more compact and 
faster cipher. Thus key-scheduling of TREYFER simply uses its 64-bit key K 
byte by byte. This is done exactly in the same fashion at each round. 

However, the simplicity of the key-schedule causes TREYFER to be a cascade
of 32 identical permutations! Suppose that two plaintexts $P$ and $P^*$ are
encrypted by TREYFER to $C$ and $C^*$. Denote the intermediate encrypted values
after each round by $P_1, \ldots, P_{32}$, where $P_{32} = C$. Denote the round
encryption function of TREYFER by $F$. Now, if two plaintexts are related by a
one-round encryption as $F(P, K) = P^*$, then the same relation must hold for
the ciphertexts: $F(C, K) = C^*$. Due to the simplicity of the round function
$F$, given a properly related pair the full 64-bit key $K$ of TREYFER can be
derived either from the equation $F(P, K) = P^*$ or from the equation
$F(C, K) = C^*$. If $(P, P^*)$ is a properly related pair, both equations
suggest the same value of the key. However, if the pair is not properly related,
there is no reason for the two keys to be equal.
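The slide relation and the byte-by-byte key derivation can be illustrated with
a small C sketch. It is not a reference implementation: the S-box is a
hypothetical bijective stand-in (the affine map x -> 167x + 13 mod 256), chosen
so that each equation has a unique solution, and the helper names
(`treyfer_round`, `recover_key`, `slide_demo`) are ours. With a random,
non-bijective S-box each key byte would have several candidates instead of one.

```c
#include <stdint.h>
#include <string.h>

#define NUM_ROUNDS 32

static uint8_t rotl8(uint8_t v) { return (uint8_t)((v << 1) | (v >> 7)); }
static uint8_t rotr8(uint8_t v) { return (uint8_t)((v >> 1) | (v << 7)); }

/* Hypothetical stand-in S-box: x -> 167*x + 13 (mod 256). It is a bijection
 * because 167 is odd; it is NOT the table of any real TREYFER instance. */
static void make_sbox(uint8_t sbox[256], uint8_t inv[256]) {
    for (int x = 0; x < 256; x++) {
        sbox[x] = (uint8_t)(167 * x + 13);
        inv[sbox[x]] = (uint8_t)x;
    }
}

/* One round F(text, key); indices are taken mod 8 instead of using text[8]. */
static void treyfer_round(uint8_t t[8], const uint8_t key[8],
                          const uint8_t sbox[256]) {
    for (int i = 0; i < 8; i++) {
        int j = (i + 1) % 8;
        t[j] = rotl8((uint8_t)(t[j] + sbox[(uint8_t)(key[i] + t[i])]));
    }
}

static void treyfer_encrypt(uint8_t t[8], const uint8_t key[8],
                            const uint8_t sbox[256]) {
    for (int r = 0; r < NUM_ROUNDS; r++) treyfer_round(t, key, sbox);
}

/* Solve F(p, K) = q for K, one key byte at a time (needs the inverse S-box). */
static void recover_key(const uint8_t p[8], const uint8_t q[8],
                        const uint8_t inv[256], uint8_t key[8]) {
    for (int i = 0; i < 8; i++) {
        int j = (i + 1) % 8;
        uint8_t in = (i == 0) ? p[0] : q[i];         /* byte fed to the S-box */
        uint8_t s  = (uint8_t)(rotr8(q[j]) - p[j]);  /* S-box output at step i */
        key[i] = (uint8_t)(inv[s] - in);
    }
}

/* Returns 1 if both equations F(P,K)=P* and F(C,K)=C* yield the true key. */
int slide_demo(void) {
    uint8_t sbox[256], inv[256];
    const uint8_t key[8] = {0x3a, 0x91, 0x07, 0xc4, 0x5e, 0xf0, 0x22, 0x8b};
    uint8_t p[8] = {1, 2, 3, 4, 5, 6, 7, 8}, q[8], c[8], cs[8], k1[8], k2[8];

    make_sbox(sbox, inv);
    memcpy(q, p, 8);  treyfer_round(q, key, sbox);     /* P* = F(P, K) */
    memcpy(c, p, 8);  treyfer_encrypt(c, key, sbox);   /* C  = E(P)    */
    memcpy(cs, q, 8); treyfer_encrypt(cs, key, sbox);  /* C* = E(P*)   */

    recover_key(p, q, inv, k1);
    recover_key(c, cs, inv, k2);
    return memcmp(k1, key, 8) == 0 && memcmp(k2, key, 8) == 0;
}
```

The check succeeds for any value of `NUM_ROUNDS`, which is the point of the
attack: the relation between $C$ and $C^*$ does not depend on how many identical
rounds are applied.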

Thus on TREYFER with an arbitrary number of rounds and with an arbitrarily
chosen S-box it is possible to mount an attack with about $2^{32}$ known
plaintexts and in the time of $2^{44}$ offline TREYFER encryptions (performed on
the attacker’s computer and not on the slow smart-card processor). Due to the
birthday paradox a pool of $2^{32}$ known plaintexts will contain a properly
related pair with high probability. Thus a naive approach is to try all the
possible $2^{63}$ pairs, and each time the two equations $F(P, K) = P^*$ and
$F(C, K) = C^*$ suggest the same 64-bit key, check this candidate key with trial
encryption. Since for each pair we perform 1/16 of a TREYFER encryption, the
overall complexity of this naive attack is $2^{59}$ TREYFER encryptions, which
is still faster than exhaustive search. However, we can do better than that if
for each plaintext we make $2^{16} = 2^8 \cdot 2^8$ guesses of the two subkeys
k[7], k[0]. For each guess we arrive at a 32-bit condition on the possible
co-related plaintext. Thus on average only one out of $2^{32}$ plaintexts passes
the 32-bit condition, and it can be easily found in a sorted array of
plaintexts. Then the newly formed pair is checked for the version of the full
64-bit key as in the naive approach. The time required by the analysis phase of
this attack is equivalent to $2^{16} \cdot 2^{32} \cdot \frac{1}{16} = 2^{44}$
TREYFER encryptions.
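As a quick check of these counts, the arithmetic can be written out as follows.
This is only a restatement of the figures in the text; $N$ is introduced here
as shorthand for the number of known plaintexts.

```latex
\begin{align*}
N &= 2^{32}, \qquad \binom{N}{2} \approx \frac{N^2}{2} = 2^{63}\ \text{pairs},\\
\text{naive:}\quad & 2^{63} \cdot \tfrac{1}{16} = 2^{63-4} = 2^{59}\ \text{encryptions},\\
\text{with guessing:}\quad & \underbrace{2^{32}}_{\text{plaintexts}} \cdot
  \underbrace{2^{16}}_{k[7],\,k[0]} \cdot \tfrac{1}{16} = 2^{44}\ \text{encryptions}.
\end{align*}
```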

Thus we have shown an attack on TREYFER with $2^{32}$ known plaintexts,
$2^{44}$ time of analysis and $2^{32}$ memory. The interesting property of this
attack is that it is independent of the number of rounds and of the exact choice
of the S-box. This attack seems to be on the verge of practicality, due to the
very slow smart-card encryption speed (6.4 msec per block) and the very slow
communication wire speed (10 KBPS). However, this task is easily parallelizable
if an attacker obtains many smart-cards containing the same secret key. Once the
attacker receives the data, the analysis can be done in a few days on an average
computer.

It should be possible to make TREYFER immune to this attack by adding
a more complex key-schedule.

6 Stream Ciphers and WAKE-ROFB 

It is also possible to mount slide attacks against stream ciphers. We show how
to attack the re-synchronization mechanism used in WAKE-ROFB, a recent
WAKE variant proposed at FSE’98. Our attacks work only under restrictive
assumptions on the IV selection and re-synchronization mechanism.
Note that this does not reflect poorly on the core of the WAKE-ROFB design;
it merely shows that dealing with re-synchronization can be tricky, because it
introduces the possibility of chosen-text attacks. In short,
WAKE-ROFB is not broken. We point out these attacks merely to illustrate the
intriguing theoretical possibility of applying slide attacks to stream ciphers.

Footnote (TREYFER): Following the results of this paper, round counters were
introduced into the round function of TREYFER as a counter-measure against such
attacks.

Footnote (WAKE-ROFB): WAKE-ROFB is a refinement of a design originally proposed
at FSE’97. In this paper, we analyze only the FSE’98 scheme; the FSE’97 cipher’s
re-synchronization mechanism appears to protect it from slide attacks.

WAKE-ROFB is a stream cipher with 32n bits of internal state, organi- 
zed into n 32-bit words. The words are updated via a simple analogue of a 
non-linear feedback shift register, extended to operate on words instead of bits. 
Writing $R_1, \ldots, R_n$ for the state registers, WAKE-ROFB’s state update
function is defined as

$$R'_1 \leftarrow R_{n-1} + F(R_n); \qquad R_j \leftarrow R_{j-1}; \qquad R_1 \leftarrow R'_1.$$

Here $F : \mathbb{Z}_2^{32} \to \mathbb{Z}_2^{32}$ is a bijective key-dependent
nonlinear function. Every $k$-th time we step the register, we output the value
of $R_n$ as the next word of the key-stream. See Figure 3 for a pictorial
illustration of the cipher.




Fig. 3. The WAKE-ROFB stream cipher 



The parameters $k$ and $n$ may be varied to suit performance and security
needs. However, the WAKE-ROFB proposal suggests two concrete parameter sets:
$(n, k) = (5, 8)$ and $(n, k) = (4, 4)$. For the $n = 5$ proposal, a concrete
scheme for loading an initialization vector is proposed: the 64-bit IV $(A, B)$
is loaded into the registers as $R_1 = R_4 = R_5 = A$, $R_2 = R_3 = B$, and then
8 words of output are generated and discarded. For the $n = 4$ proposal, no
scheme for loading an IV was suggested.

Note that, to support re-synchronization, WAKE-ROFB is built around a 
mode of operation that is perhaps somewhat unusual for a stream cipher. Many 
published stream cipher constructions use a public feedback function and load 
their initial state from the key, and often no allowance is made for re-synchro- 
nization. In contrast, WAKE-ROFB is keyed solely by the choice of the key- 
dependent permutation F, and the initial state of the register is loaded from a 
publicly-known IV. Re-synchronization in WAKE-ROFB is easily accomplished
by choosing a new IV. 

The main observation is that this construction can be viewed as roughly
an unbalanced Feistel cipher (with round function $F$) that outputs one word
every $k$ rounds. From this viewpoint, there is no round-dependence in the round
transformation. Since Feistel ciphers with no round-dependence are susceptible
to slide attacks, it seems natural to suspect that slide attacks may also prove
useful against the WAKE-ROFB stream cipher. This is indeed the case.

Footnote (publicly-known IV): But note that slide attacks do not always require
knowledge of the initial state of the register. For instance, some of our
attacks would still be possible even if the construction were modified to load
the initial state of the register as, e.g., the Triple-DES-CBC decryption of the
IV under some additional keying material.

First, as a warmup exercise, we note that when the attacker has full control
over the initial state of the stream cipher, simple slide attacks are often
available. The attack is the same as a chosen-plaintext slide attack on a
Feistel cipher with constant round subkeys. We fix $r_1, \ldots, r_{n-1}$, and
generate $2^{16}$ IV’s of the form $IV_X = (r_1, \ldots, r_{n-1}, X)$ by varying
$X$. We also generate $2^{16}$ IV’s of the form $IV_Y = (Y, r_1, \ldots, r_{n-1})$
by varying $Y$. Note that if $r_{n-1} + F(X) = Y$, we will have a successful
slide relation between the key-stream generated by $IV_X$ and the key-stream
generated by $IV_Y$. For such $X$, $Y$, the resulting internal states will be
closely related: if we let $S_X[t] = (R_{1,X}[t], \ldots, R_{n,X}[t])$ be the
$32n$-bit state generated from $IV_X$ by stepping the cipher $t$ times, then
$S_Y[t] = S_X[t + 1]$ for all $t$.
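A minimal C sketch of this slide relation for $(n, k) = (4, 4)$ follows. The
function `toy_f` is a hypothetical bijective stand-in for the key-dependent $F$
(it is not WAKE's table-based function), and we assume the key-stream word is
$R_n$ after every $k$-th step. Under these assumptions the slid streams satisfy
$F(\mathrm{out}_X[j]) = \mathrm{out}_X[j+1] - \mathrm{out}_Y[j]$, so each time
step reveals one input/output pair of $F$.

```c
#include <stdint.h>
#include <string.h>

#define N 4   /* number of 32-bit registers        */
#define K 4   /* one key-stream word every K steps */

/* Hypothetical bijective stand-in for the key-dependent function F. */
static uint32_t toy_f(uint32_t x) {
    x ^= 0x9e3779b9u;   /* plays the role of the key      */
    x *= 0x85ebca6bu;   /* odd multiplier: invertible     */
    x ^= x >> 15;       /* xorshift: invertible           */
    return x;
}

/* One step: R'_1 = R_{n-1} + F(R_n); R_j = R_{j-1}; R_1 = R'_1.
 * r[0] holds R_1 and r[N-1] holds R_n. */
static void step(uint32_t r[N]) {
    uint32_t r1 = r[N - 2] + toy_f(r[N - 1]);
    memmove(&r[1], &r[0], (N - 1) * sizeof r[0]);
    r[0] = r1;
}

/* Produce `count` key-stream words (R_n after every K-th step). */
static void keystream(const uint32_t iv[N], uint32_t *out, int count) {
    uint32_t r[N];
    memcpy(r, iv, sizeof r);
    for (int j = 0; j < count; j++) {
        for (int s = 0; s < K; s++) step(r);
        out[j] = r[N - 1];
    }
}

/* Returns 1 if the slid streams expose correct input/output pairs of F. */
int wake_slide_demo(void) {
    uint32_t x = 0xdeadbeefu;
    uint32_t ivx[N] = {0x11111111u, 0x22222222u, 0x33333333u, x};
    /* IV_Y = (Y, r_1, r_2, r_3) with Y = r_3 + F(X), i.e. IV_X stepped once */
    uint32_t ivy[N] = {ivx[N - 2] + toy_f(x), ivx[0], ivx[1], ivx[2]};
    uint32_t ox[9], oy[9];

    keystream(ivx, ox, 9);
    keystream(ivy, oy, 9);
    for (int j = 0; j + 1 < 9; j++)
        if (toy_f(ox[j]) != ox[j + 1] - oy[j]) return 0;
    return 1;
}
```

An attacker does not know $F$, of course; in practice the recovered pairs would
be tested for consistency with the structure of $F$ rather than compared
against it as the sketch does.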

In many cases, this condition can be easily recognized, because the
key-streams will be highly related to each other. For instance, for the
$(n, k) = (4, 4)$ proposal, if we know the key-stream outputs from $IV_X$ at
times $jk$, $(j + 1)k$ and the key-stream output from $IV_Y$ at time $jk$, we
can deduce one input-output pair for the $F$ function for each time step; this
property allows us to easily recognize slid pairs with about 8 known outputs for
the $F$ proposed for WAKE-ROFB. Analysis is apparently more difficult when
$\gcd(n, k) = 1$, but attacks are still available (albeit with increased data
requirements) by choosing $n \cdot 2^{32}$ IV’s of the form
$(Y, \ldots, Y, r, \ldots, r)$; the crucial observation is that $(r, \ldots, r)$
forms a slid pair with $(F(r) + r, r, \ldots, r)$, which forms a slid pair with
$(F(r) + r, F(r) + r, r, \ldots, r)$, and so on.

We conclude that a slide attack may be possible with as few as $2^{17}$ streams
(each containing at least 8 known outputs), when the attacker has full control
over the initial state of the register. This situation might occur if, for
instance, the IV-loading mechanism simply loaded the initial state of the
register directly as the value of an $n$-word IV, since then an attacker would
be able to control the initial state directly with a chosen-IV chosen-ciphertext
attack. One corollary is that, to prevent slide attacks, the IV-loading
mechanism must be carefully designed. Note that the WAKE-ROFB design precludes
these attacks by explicitly specifying a resynchronization mechanism that allows
only limited control over the initial state of the cipher.

Even when the attacker has no control over the initial state of the register,
known-IV slide attacks may still be possible. By analogy to the standard
known-text attacks on block ciphers, we expect to find one successful slide
relation after examining about $2^{16n}$ known text streams (the birthday bound
for a $32n$-bit state), and in some cases this might enable successful
cryptanalysis of the cipher. One simple defense is to increase the size of the
internal state enough so that the data requirements become infeasible.

Footnote (8 known outputs): This is because the WAKE-ROFB proposal constructs
the T table from two 4 × 16-bit lookup tables, and by the birthday paradox after
7 observations of a 4-bit value we expect to see a collision or two. But even
for more sophisticated constructions of the F function, the number of known
outputs needed would not increase substantially. With a randomly generated
T table, about 40 known outputs would suffice; even if the entire function F
were chosen randomly, known outputs should be enough to detect slid pairs.

Finally, we consider the concrete IV-loading scheme proposed for the
$(n, k) = (5, 8)$ WAKE-ROFB cipher. There the 64-bit IV $(A, B)$ is loaded into
the registers as $(R_1, \ldots, R_5) = (A, B, B, A, A)$, and then 8 words of
output are generated and discarded.

We note that a slide attack on this scheme is still possible when $2^{32}$
chosen-IV queries are available. We obtain known key-stream output for the
$2^{32}$ IV’s of the form $(A, A)$. This loads the initial state of the
registers with $(R_1, \ldots, R_5) = (A, \ldots, A)$. Note that when $F(A) = 0$,
we will have $R'_1 = A$, and so stepping the initial state $(A, \ldots, A)$
gives the state $(A, \ldots, A)$. In other words, for $A = F^{-1}(0)$, we obtain
a cycle of period one. This can be easily recognized from a short stretch of
known key-stream output, and allows us to obtain 32 bits of information on the
key.
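The period-one cycle can be seen in a short C sketch. Again `toy_f` is a
hypothetical bijective stand-in for $F$, built so that `toy_f(x) == 0` exactly
when `x == TOY_KEY`; the real $F$ and its preimage of zero are unknown to the
attacker, who would instead try all $2^{32}$ values of $A$.

```c
#include <stdint.h>
#include <string.h>

#define N 5                      /* the (n, k) = (5, 8) proposal */
#define TOY_KEY 0x5bd1e995u      /* hypothetical: toy_f(TOY_KEY) == 0 */

/* Hypothetical bijective stand-in for F with a known preimage of zero. */
static uint32_t toy_f(uint32_t x) {
    x ^= TOY_KEY;
    x *= 0x85ebca6bu;   /* odd multiplier: maps only 0 to 0 */
    x ^= x >> 15;       /* xorshift: maps only 0 to 0       */
    return x;
}

/* One step: R'_1 = R_{n-1} + F(R_n); R_j = R_{j-1}; R_1 = R'_1. */
static void step(uint32_t r[N]) {
    uint32_t r1 = r[N - 2] + toy_f(r[N - 1]);
    memmove(&r[1], &r[0], (N - 1) * sizeof r[0]);
    r[0] = r1;
}

/* Load the IV (A, A), i.e. state (A, A, A, A, A), step `steps` times, and
 * report whether the state is still (A, ..., A). */
int unchanged_after(uint32_t a, int steps) {
    uint32_t r[N] = {a, a, a, a, a};
    for (int t = 0; t < steps; t++) step(r);
    for (int i = 0; i < N; i++)
        if (r[i] != a) return 0;
    return 1;
}
```

Calling `unchanged_after(TOY_KEY, 64)` models the 8 discarded output words
(8 words at 8 steps each): the state has not moved, so discarding output does
not remove the fixed point.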

It is clear that the design of a secure IV-loading mechanism for WAKE- 
ROFB-like stream ciphers is non-trivial. Certainly running the cipher for 8k 
time steps and discarding the outputs helps stop some attacks, but as we have 
shown, it is not always sufficient. 

Therefore, we propose the following design principle for such stream ciphers:

Whenever possible, the feedback function should
include some form of round-dependence.



7 Key-Dependent S-Boxes: A Variant of Blowfish 

The following was inspired by a 1978 paper due to Grossman and Tuckerman.
In this section we show, using more modern techniques, that if the only
strength of a cipher comes from key-dependent S-boxes (with no round
dependence), then such a cipher can be attacked easily using slide attacks.
This shows that slide attacks are not restricted to ciphers with weak
key-scheduling algorithms.

For an example of how this might work, consider the cipher Blowfish, which was
designed by Bruce Schneier. This is a Feistel cipher with a 64-bit block,
16 rounds and up to 448 bits of secret key. These are expanded into a table
consisting of four S-boxes from 8 to 32 bits (4096 bytes total). The S-boxes are
key-dependent and unknown to the attacker. Also, in each round a 32-bit subkey
$P_i$ is XORed to one of the inputs. At the end two 32-bit subkeys $P_{17}$ and
$P_{18}$ are XORed to the output of the cipher. See Figure 4 for a picture of
one round of Blowfish. So far no attacks are known on the full version of this
cipher.

Footnote (WAKE-ROFB design principle): Following the results of this paper,
round counters were introduced into the re-synchronization mechanism of
WAKE-ROFB as a counter-measure against such attacks.

Fig. 4. One round of Blowfish.



The best previous result is a differential attack on Blowfish with known
S-boxes, which can find the $P_i$ array using $2^{8r+1}$ chosen plaintexts,
where $r$ stands for the number of rounds. For certain weak keys that generate
bad S-boxes (1 out of $2^{14}$ keys) the same attack requires $2^{4r+1}$ chosen
plaintexts (still completely ineffective against 16-round Blowfish).

Assume that all the $P_i$'s are equal to zero. In this case one may notice that
all rounds of the cipher perform the same transformation, which is
data-dependent. Thus, given a 32-bit input to the $F$-function, the output of
the $F$-function is uniquely determined. Also, only 16 bytes out of 4096 take
part in each evaluation of the $F$-function. Thus one naive approach is to fix a
plaintext $P$, guess all these 128 bits of the key, partially encrypt $P$ with
the guessed keys for one Feistel round, and then perform a slide attack for $P$
and for the guessed text. A less naive approach is to guess the 32-bit output of
the $F$-function and thus obtain a correct encryption with one Feistel round in
$2^{32}$ steps, checking whether the guess was correct with the usual sliding
technique. An even better approach is to encrypt two pools of chosen plaintexts
$(X, P_R)$ and $(P_R, Y)$, where $X$ and $Y$ both receive $2^{16}$ random values
and $P_R$ is fixed. Thus with high probability there is an element $(P_R, Y_i)$
in the second pool which is an exact one-round encryption of some element
$(X_j, P_R)$ from the first pool. Such a pair can be easily detected by sliding
considerations (especially if we repeat this experiment with the same value of
$P_R$ and other random values of $X$ and $Y$).
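The sliding consideration can be sketched in C on a toy model of this variant.
Everything here is an assumption made for illustration: the S-boxes are filled
by a simple linear congruential generator rather than by Blowfish's key
schedule, the round is oriented as $(L, R) \to (R, L \oplus F(R))$ to match the
pools $(X, P_R)$ and $(P_R, Y)$ above, and there is no final swap or output
whitening.

```c
#include <stdint.h>

typedef struct { uint32_t l, r; } block;

static uint32_t sbox[4][256];

/* Fill four 8->32-bit S-boxes from a seed (toy stand-in for the key schedule). */
static void toy_keyed_sboxes(uint32_t seed) {
    for (int b = 0; b < 4; b++)
        for (int i = 0; i < 256; i++) {
            seed = seed * 1664525u + 1013904223u;  /* LCG, illustration only */
            sbox[b][i] = seed;
        }
}

/* Blowfish-style F: ((S1[a] + S2[b]) ^ S3[c]) + S4[d]. */
static uint32_t toy_f(uint32_t x) {
    return ((sbox[0][x >> 24] + sbox[1][(x >> 16) & 0xff])
            ^ sbox[2][(x >> 8) & 0xff]) + sbox[3][x & 0xff];
}

/* One Feistel round with all subkeys P_i = 0: (L, R) -> (R, L ^ F(R)). */
static block feistel_round(block b) {
    block o = { b.r, b.l ^ toy_f(b.r) };
    return o;
}

static block encrypt(block b, int rounds) {
    for (int i = 0; i < rounds; i++) b = feistel_round(b);
    return b;
}

/* Returns 1 if a slid pair passes the 32-bit filter on both the plaintext and
 * ciphertext side and yields two correct equations for F. */
int feistel_slide_demo(int rounds) {
    toy_keyed_sboxes(12345u);
    block p  = { 0x01234567u, 0x89abcdefu };   /* (X, P_R)                  */
    block ps = feistel_round(p);               /* (P_R, Y), Y = X ^ F(P_R)  */
    block c  = encrypt(p, rounds);
    block cs = encrypt(ps, rounds);

    if (ps.l != p.r || cs.l != c.r) return 0;  /* sliding filter condition  */
    return toy_f(p.r) == (p.l ^ ps.r) && toy_f(c.r) == (c.l ^ cs.r);
}
```

The two equations checked at the end, $F(P_R) = X \oplus Y$ and
$F(C_R) = C_L \oplus C^*_R$, are the two 32-bit input/output pairs of $F$ that
each slid pair gives the attacker.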

Each slid pair provides us with about 64 bits of key-dependent S-box
information (two equations for the $F$-function). Thus with about 500 probes of
this type it is possible to find all four S-boxes. Data for this attack can be
packed into structures efficiently. Thus we have a very simple slide attack with
only about $2^9 \cdot 2^{18} = 2^{27}$ chosen plaintexts on this variant of
Blowfish. Similar techniques can be applied to any iterative autokey cipher.

This attack is independent of the number of rounds of the cipher (be it 16 or
16000 rounds), of the exact structure of the $F$-function, of the key length,
and of the key-schedule, no matter how complex the S-box generation process is.
This shows that slide attacks are not restricted to ciphers with weak
key-scheduling.

Abstract. CS-Cipher is a block cipher which was proposed at FSE 1998. It is a
Markov cipher in which diffusion is performed by multipermutations. In this
paper we first provide a formal treatment of differential, linear and truncated
differential cryptanalysis, and we apply it to CS-Cipher in order to prove that
there exists no good characteristic for these attacks. This holds under the
approximation that all round keys of CS-Cipher are uniformly distributed and
independent. For this we introduce a new technique for counting active Sboxes in
computational networks by the Floyd-Warshall algorithm.



Since the beginning of modern public research in symmetric encryption, block
ciphers have been designed with fixed computational networks: we draw a network
and put some computation boxes on it. The Feistel scheme is a popular design
which makes it possible to build an invertible function from a random function.
Its main advantage is that decryption and encryption are fairly similar, because
we only have to reverse the order of operations.

Another popular (and more intuitive) design consists of a cascade of
computational layers, some of which implement parallel invertible
transformations. (People inappropriately call it the “SPN structure”, for
Substitution Permutation Network, as opposed to Feistel schemes. Referring to
Adams’ thesis, several Feistel schemes are also SPN ones.) For this we need two
different implementations for encryption and decryption. Several such designs
have been proposed to the Advanced Encryption Standard process: Serpent, Safer+,
Rijndael and Crypton. In this paper we focus on CS-Cipher in order to
investigate its security.

The main general known attacks are Biham and Shamir’s differential
cryptanalysis and Matsui’s linear cryptanalysis. Among their variants,
Knudsen’s truncated differentials have been shown to be powerful against
Massey’s Safer block cipher, so we investigate them as well. In this paper we
consider these attacks and we (heuristically) show that CS-Cipher is resistant
against them. For this we use the well-known technique of counting active
Sboxes.

Here we first recall what can be formally proven under the intuitive
approximation that all round keys are uniformly distributed and independent for
differential and linear cryptanalysis. We contribute a new, similar analysis
of truncated differential cryptanalysis. Then we apply these techniques to
CS-Cipher. In particular we show how to count the minimal number of active
Sboxes in a computational network with multipermutations by using some easy
graph algorithms.

Footnote (CS-Cipher security): While this paper was presented, the owner of the
CS-Cipher algorithm announced a “Challenge CS-Cipher”: a 10000 euros award will
be given to the first person who decrypts a message encrypted with a key which
has been purposely limited to 56 bits. This is basically an exhaustive search
race. See http://www.cie-signaux.fr/.

L. Knudsen (Ed.): FSE’99, LNCS 1636, pp. 260 ff., 1999.
© Springer-Verlag Berlin Heidelberg 1999



1 Previous Work 



Public research on cryptography arose in the late 70s. On block ciphers,
research has paradoxically been motivated by the Data Encryption Standard
controversy: the fact that the design rationale of DES was kept secret by the US
government.

Originally the research community focused on the fascinating nonlinear
properties of the DES Sboxes (and on the existence of a mythical hidden
trapdoor). Nonlinearity criteria have been investigated, together with ways to
achieve them (see Adams and Tavares, and Nyberg).

Differential cryptanalysis was invented by Biham and Shamir in the 90s, and a
connection between the security against it and the nonlinearity of the Sboxes
has been found (see Nyberg). Later on the same link arose with Matsui’s linear
cryptanalysis (see Nyberg and Chabaud-Vaudenay).

Since then a considerable effort has been made to study the security
of block ciphers against differential and linear cryptanalysis.

Lai and Massey first invented the notion of “Markov cipher”, which makes
possible a formal treatment of the security against differential cryptanalysis.
This makes it possible to formally prove some heuristic approximations used in
this attack.

It was well known that the resistance against differential cryptanalysis de- 
pends on the minimal number of “active Sboxes” in a characteristic. (The bulk 
of Biham and Shamir’s attack against DES is to find a characteristic with a 
number of active Sboxes as small as possible.) 

Heys and Tavares define the notion of “diffusion order”, which makes it
possible to get a lower bound on the number of active Sboxes in a
substitution-permutation network. Similarly, Daemen talks about the “branch
number”. The question was also addressed by Youssef, Mister and Tavares. These
notions can be used together with the diffusion properties of the network.
Namely, when we use “multipermutations” we can compute these numbers. In
particular, inspired by the notion of multipermutation, Daemen, Knudsen and
Rijmen use an MDS code in the Square cipher, an approach which has been used for
two AES candidates: Rijndael and Crypton.

An alternate way to prove the security against differential and linear
cryptanalysis is to use the Theorem of Nyberg and Knudsen (or a variant),
which has been done by Matsui in the Misty cipher. We can also use the
“decorrelation theory”, which has been used in order to create the Peanut and
Coconut cipher families and the DFC cipher, which is an AES candidate. Both
approaches provide provable security against differential and linear
cryptanalysis (and not only a heuristic security).



2 Formal Treatment on Markov Ciphers 

In this section we consider an $r$-round cipher

$$\mathrm{Enc}(x) = (\rho_r \circ \cdots \circ \rho_1)(x)$$

in which each round $\rho_i$ uses a subkey $k_i$. We assume that this is a
Markov cipher with respect to the XOR addition law, which means, following Lai,
that for any round $i$ and any $x$, $a$ and $b$, we have

$$\Pr_{k_i}[\rho_i(x \oplus a) \oplus \rho_i(x) = b] = \Pr_{k_i, X}[\rho_i(X \oplus a) \oplus \rho_i(X) = b]$$

where $X$ is uniformly distributed. (Here $\oplus$ denotes the bitwise XOR
operation.)



2.1 Preliminaries 

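For any $p$-to-$q$-bit function $f$, any $p$-bit $a$ and any $q$-bit $b$, let us
denote

$$\mathrm{DP}^f(a, b) = \Pr_X[f(X \oplus a) \oplus f(X) = b] \tag{1}$$

$$\mathrm{LP}^f(a, b) = \left(2 \Pr_X[a \cdot X = b \cdot f(X)] - 1\right)^2. \tag{2}$$

(Here $a \cdot x$ denotes the dot product of $a$ and $x$: the sum modulo 2 of all
$a_i x_i$.) It is well known that we have

$$\mathrm{LP}^f(a, b) = 2^{-p} \sum_{x, y} (-1)^{(a \cdot x) + (b \cdot y)}\, \mathrm{DP}^f(x, y) \tag{3}$$

$$\mathrm{DP}^f(a, b) = 2^{-q} \sum_{x, y} (-1)^{(a \cdot x) + (b \cdot y)}\, \mathrm{LP}^f(x, y). \tag{4}$$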
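To make the definitions concrete, the following C sketch computes the two
quantities for a small 4-bit example and checks the first of the two identities
in integer form. The S-box is an arbitrary illustrative choice (the first row of
DES S1), not part of CS-Cipher. We work with the counts
$N(a,b) = 2^p\,\mathrm{DP}^f(a,b)$ and the correlation sums
$W(a,b) = 2^p\,(2\Pr[a \cdot X = b \cdot f(X)] - 1)$, so that the identity
becomes $W(a,b)^2 = \sum_{x,y} (-1)^{(a \cdot x)+(b \cdot y)} N(x,y)$.

```c
#include <stdint.h>

#define SIZE 16u   /* 2^p with p = q = 4 */

/* Illustrative 4-bit S-box (first row of DES S1); any table would do. */
static const uint8_t SBOX[SIZE] =
    {14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7};

/* Dot product over GF(2): parity of a & x. */
static int dot(unsigned a, unsigned x) {
    unsigned v = a & x;
    int parity = 0;
    while (v) { parity ^= (int)(v & 1u); v >>= 1; }
    return parity;
}

/* N(a,b) = #{x : f(x ^ a) ^ f(x) = b}, so DP(a,b) = N(a,b) / SIZE. */
int diff_count(unsigned a, unsigned b) {
    int n = 0;
    for (unsigned x = 0; x < SIZE; x++)
        if ((unsigned)(SBOX[x ^ a] ^ SBOX[x]) == b) n++;
    return n;
}

/* W(a,b) = sum_x (-1)^{a.x + b.f(x)}, so LP(a,b) = (W(a,b) / SIZE)^2. */
int walsh(unsigned a, unsigned b) {
    int w = 0;
    for (unsigned x = 0; x < SIZE; x++)
        w += (dot(a, x) == dot(b, SBOX[x])) ? 1 : -1;
    return w;
}

/* Returns 1 if W(a,b)^2 == sum_{x,y} (-1)^{a.x + b.y} N(x,y) for all a, b. */
int check_lp_from_dp(void) {
    for (unsigned a = 0; a < SIZE; a++)
        for (unsigned b = 0; b < SIZE; b++) {
            int sum = 0;
            for (unsigned x = 0; x < SIZE; x++)
                for (unsigned y = 0; y < SIZE; y++) {
                    int n = diff_count(x, y);
                    sum += (dot(a, x) ^ dot(b, y)) ? -n : n;
                }
            if (sum != walsh(a, b) * walsh(a, b)) return 0;
        }
    return 1;
}
```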



For any random function $F$ (or equivalently any function $f$ which depends
on a random parameter $K$) we consider the expected values over the distribution
of $F$

$$\mathrm{EDP}^F(a, b) = E_F(\mathrm{DP}^F(a, b)) \tag{5}$$

$$\mathrm{ELP}^F(a, b) = E_F(\mathrm{LP}^F(a, b)). \tag{6}$$

Obviously, Equations similar to (3) and (4) hold for EDP and ELP.

2.2 Differential Cryptanalysis of Markov Ciphers

Biham and Shamir’s original differential cryptanalysis is defined by a
characteristic

$$\Omega = (\omega_0, \ldots, \omega_r) \tag{7}$$

and focuses on the probabilistic event

$$E_\Omega : \{M_i \oplus M'_i = \omega_i;\ i = 1, \ldots, r \,/\, M_0 \oplus M'_0 = \omega_0\}$$

where $M_0$ and $M'_0$ are two different plaintexts and $M_i$ and $M'_i$ are the
images of $M_0$ and $M'_0$ respectively by $\rho_i \circ \cdots \circ \rho_1$.
We let $\Delta M_i = M_i \oplus M'_i$.

Differential cryptanalysis uses the $E_\Omega$ event by looking at random pairs
$(M_0, M'_0)$ such that $\Delta M_0 = \omega_0$ until $E_\Omega$ occurs. If we
try $n$ random pairs, the success rate is at most the probability that one out
of the $n$ pairs makes the $E_\Omega$ event occur. This probability is

$$1 - \left(1 - \Pr_{M_0, M'_0}[E_\Omega]\right)^n. \tag{8}$$

Thus the probability of success is less than $n \Pr[E_\Omega]$. The average
probability of success (over the distribution of the key) is less than

$$n\, E_k\!\left(\Pr_{M_0, M'_0}[E_\Omega]\right).$$

Thus we need a number of trials of $p / E_k(\Pr[E_\Omega])$ in order to achieve
an average probability of success at least $p$.

We define the following formal product

$$\mathrm{DP}(\Omega) = \prod_{i=1}^{r} \mathrm{DP}^{\rho_i}(\omega_{i-1}, \omega_i)$$

(which depends on the key) and

$$\mathrm{EDP}(\Omega) = \prod_{i=1}^{r} \mathrm{EDP}^{\rho_i}(\omega_{i-1}, \omega_i)$$

(which does not). We have the following result, which is fairly similar to the
treatment of Lai, Massey and Murphy.

Footnote (Lai, Massey and Murphy): The difference is that these authors assume
the “principle of stochastic equivalence”, which makes it possible to remove the
expectation over the distribution of the key. The proof is exactly the same.

Lemma 1. If Enc is a Markov cipher and if the round keys $k_1, \ldots, k_r$ are
uniformly distributed and independent, we have

$$E_k\!\left(\Pr_{M_0, M'_0}[E_\Omega]\right) = \mathrm{EDP}(\Omega).$$

Proof. Since Enc is a Markov cipher and the keys are independent, from Lai we
know that $\Delta M_0, \ldots, \Delta M_r$ is a Markov chain. Thus we have

$$\begin{aligned}
E_k\!\left(\Pr_{M_0, M'_0}[E_\Omega]\right)
 &= \prod_{i=1}^{r} \Pr_{M_0, M'_0, k}[\Delta M_i = \omega_i \,/\, \Delta M_{i-1} = \omega_{i-1}, \ldots, \Delta M_0 = \omega_0] \\
 &= \prod_{i=1}^{r} \Pr_{M_0, M'_0, k}[\Delta M_i = \omega_i \,/\, \Delta M_{i-1} = \omega_{i-1}] \\
 &= \mathrm{EDP}(\Omega).
\end{aligned}$$

□



We thus have the following theorem. 

Theorem 2. Given a Markov cipher $\mathrm{Enc} = \rho_r \circ \cdots \circ \rho_1$
for the XOR addition law which uses $r$ independent round keys and any
differential characteristic $\Omega = (\omega_0, \ldots, \omega_r)$, in order to
achieve an average probability of success greater than $p$ for a differential
cryptanalysis we need a number of trials of at least
$p\,(\mathrm{EDP}(\Omega))^{-1}$. This holds in the model where the probability
of success of the differential cryptanalysis for a fixed key is given by
Equation (8).

We emphasize that this is a real formal theorem which does not rely on unproven
assumptions.

2.3 Truncated Differential Cryptanalysis 

In Biham and Shamir’s original differential cryptanalysis, we have a given
input difference $\omega_0$ and we expect a given output difference $\omega_r$
when the computation follows a path of differences
$\omega_1, \ldots, \omega_{r-1}$. It is sometimes useful to consider a
multi-path with the same $\omega_0$ and $\omega_r$. Actually, we have

$$\mathrm{DP}^{\mathrm{Enc}}(\omega_0, \omega_r) = \sum_{\omega_1, \ldots, \omega_{r-1}} \Pr_{M_0, M'_0}[E_\Omega]. \tag{9}$$

Thus, from Lemma 1 we obtain that

$$\mathrm{EDP}^{\mathrm{Enc}}(\omega_0, \omega_r) = \sum_{\omega_1, \ldots, \omega_{r-1}} \prod_{i=1}^{r} \mathrm{EDP}^{\rho_i}(\omega_{i-1}, \omega_i). \tag{10}$$

In the original differential cryptanalysis we only consider the overwhelming
term of this sum.

An alternative way is to consider a sub-sum of characteristics which correspond
to the same pattern. For instance, Knudsen’s truncated differentials correspond
to the sum of all characteristics which predict some part of the differences,
i.e. for which

$$\forall i\ \forall j \in I_i \quad (\omega_i)_j = d_{i,j}.$$

In most block ciphers, what makes the probability of a characteristic small
is the differences $d_{i,j}$ of zero. We thus focus on the propagation of zero
differences. (Actually, the attack on Safer by Knudsen and Berson uses truncated
differentials with zeroes.) Let us denote

$$\mathrm{Supp}(\omega_i) = \{j;\ (\omega_i)_j \neq 0\}.$$

If we focus on characteristics in which

$$\forall i \quad \mathrm{Supp}(\omega_i) = I_i \ \text{and}\ \omega_i \in A_i$$

we can get a maximal probability with the largest $A_i$ sets. The multi-path sum
is thus defined by $\Omega = (I_0, \ldots, I_r)$ and we consider the event

$$E_\Omega = \{\mathrm{Supp}(\Delta M_i) = I_i;\ i = 1, \ldots, r \,/\, \mathrm{Supp}(\Delta M_0) = I_0\}$$

in which $\Omega = (I_0, \ldots, I_r)$. We call $\Omega$ a “support
characteristic”.

Theorem 3. Given a Markov cipher $\mathrm{Enc} = \rho_r \circ \cdots \circ \rho_1$
for the XOR addition law which uses $r$ independent round keys, we consider a
truncated differential cryptanalysis. We heuristically assume that there is an
overwhelming support characteristic $\Omega = (I_0, \ldots, I_r)$ which is such
that the probability of success of the differential cryptanalysis for a fixed
key is given by Equation (8). The complexity of the attack must be greater than

$$p \left(\prod_{i=1}^{r} \Pr[\mathrm{Supp}(\Delta M_i) = I_i \,/\, \mathrm{Supp}(\Delta M_{i-1}) = I_{i-1}]\right)^{-1}$$

in order to get an average probability of success greater than $p$, with the
notations of Section 2.2.



2.4 Linear Cryptanalysis 

Linear cryptanalysis is fairly similar to differential cryptanalysis. Here we
consider a characteristic $\Omega$ associated with a set of linear
approximations

$$(\omega_i \cdot M_i) \oplus (\omega_{i+1} \cdot M_{i+1}) \approx \alpha_i \cdot k_i.$$

The characteristic is not associated with a particular event, but corresponds to
an (assumed) overwhelming term in the multi-path sum. We define the following
formal product

$$\mathrm{LP}(\Omega) = \prod_{i=1}^{r} \mathrm{LP}^{\rho_i}(\omega_{i-1}, \omega_i)$$

(which depends on the key) and

$$\mathrm{ELP}(\Omega) = \prod_{i=1}^{r} \mathrm{ELP}^{\rho_i}(\omega_{i-1}, \omega_i).$$

We have the following result.

Lemma 4. If Enc is a Markov cipher and if the round keys $k_1, \ldots, k_r$ are
independent, we have

$$\mathrm{ELP}^{\mathrm{Enc}}(\omega_0, \omega_r) = \sum_{\omega_1, \ldots, \omega_{r-1}} \mathrm{ELP}(\omega_0, \ldots, \omega_r).$$

Proof. First, by Equations (3) and (10) we have

$$\mathrm{ELP}^{\mathrm{Enc}}(\omega_0, \omega_r) = 2^{-\ell} \sum_{u_0, \ldots, u_r} (-1)^{(\omega_0 \cdot u_0) + (\omega_r \cdot u_r)} \prod_{i=1}^{r} \mathrm{EDP}^{\rho_i}(u_{i-1}, u_i)$$

where $\ell$ is the bit-length of the plaintext. If we now use Equation (4),
after a few formal computation steps we obtain the result. □

Matsui’s original linear cryptanalysis assumes that one characteristic in the
sum is overwhelming, and the attack has a heuristic complexity equal to the
inverse of $\mathrm{ELP}^{\mathrm{Enc}}(\omega_0, \omega_r)$. We can thus get a
(heuristic) complexity lower bound by upper bounding $\mathrm{ELP}(\Omega)$.

3 On the Security of CS-Cipher 

3.1 Presentation of CS-Cipher 

In this paper we use non-standard notations for CS-Cipher which are better
adapted to our treatment. We recall that the secret key is first transformed
into a 64-bit subkey sequence $k^0, \ldots, k^8$. We also use two 64-bit
constants $c$ and $c'$. We let $k_0, \ldots, k_{24}$ denote the sequence

$$k^0, c, c', k^1, c, c', \ldots, k^7, c, c', k^8.$$

We thus consider a modified CS-Cipher, denoted CSC*, which is defined by a
1600-bit random key $k = (k_0, \ldots, k_{24})$ with a uniform distribution.

Let us denote

$$s_i(x) = x \oplus k_i$$

which is thus used to “randomize” the message block with a subkey.

We split the standard mixing function $M$ into one linear transformation $\mu$
and two involutions $P$. The linear mapping $\mu$ takes two 8-bit inputs and
produces two 8-bit outputs by

$$\mu(a, b) = (\varphi(a) \oplus b,\ R_l(a) \oplus b).$$

Here $R_l$ is a circular rotation by one position to the left, and $\varphi$ is
the standard CS-Cipher operation defined by

$$\varphi(x) = (R_l(x) \wedge 55) \oplus x$$

where $\wedge$ denotes the bitwise AND operation and 55 is the 8-bit hexadecimal
constant (01010101 in binary). For convenience we let $\mu^4$ denote the linear
mapping which takes eight 8-bit inputs and produces eight 8-bit outputs by four
parallel $\mu$ operations

$$\mu^4(x_1, \ldots, x_8) = (\mu(x_1, x_2), \ldots, \mu(x_7, x_8)).$$

We let $P$ denote the standard CS-Cipher involution defined by a table look-up,
and $P^8$ the application of eight parallel $P$ computations:

$$P^8(x_1, \ldots, x_8) = (P(x_1), \ldots, P(x_8)).$$

We now let $L_\pi$ denote the following permutation

$$L_\pi(x_1, \ldots, x_8) = (x_1, x_3, x_5, x_7, x_2, x_4, x_6, x_8).$$

We let

$$\rho_i = L_\pi \circ P^8 \circ \mu^4 \circ s_{i-1}$$

for $i = 1, \ldots, 23$ and

$$\rho_{24} = s_{24} \circ L_\pi \circ P^8 \circ \mu^4 \circ s_{23}.$$

One CS-Cipher block encryption is defined by

$$\mathrm{Enc} = \rho_{24} \circ \cdots \circ \rho_1.$$

This way we can consider CSC* as a 24-round cipher in which each round consists
of one subkey offset, the $\mu^4$ linear mixing function, the $P^8$ confusion
boxes and the $L_\pi$ permutation. Due to the $s_i$ structure, it is obvious
that CSC* is a Markov cipher.

3.2 Differential Cryptanalysis 

We consider a differential characteristic $\Omega = (\omega_0, \ldots, \omega_{24})$ and we aim to 
upper bound EDP($\Omega$). 

We compute each term of the product in EDP($\Omega$). Since $s_i$, $\mu^4$ and $L_\pi$ are 
linear, we let 

$\delta_i = \mu^4(\omega_{i-1})$   (11) 

and 

$\delta'_i = L_\pi^{-1}(\omega_i).$   (12) 

$\delta_i$ and $\delta'_i$ are the input and output differences of the $i$th $P^8$ layer 
respectively. From the linearity of $\mu^4$ and $L_\pi$ and the parallelism of $P^8$ we obtain 

$\mathrm{EDP}(\Omega) = \prod_{i=1}^{24} \prod_{j=1}^{8} \mathrm{DP}^P\left((\delta_i)_j, (\delta'_i)_j\right).$   (13) 

Since P is a permutation, the following assumption is necessary for having 
EDP($\Omega$) $\neq$ 0: 

$\forall i \quad \mathrm{Supp}(\delta_i) = \mathrm{Supp}(\delta'_i).$   (14) 

We say that a P-box corresponding to indices i, j is “active” if $(\delta_i)_j \neq 0$. We 
use the following definition. 

Definition 5. For any differential characteristic $\Omega = (\omega_0, \ldots, \omega_{24})$, we define 
the $\delta_i$'s and $\delta'_i$'s by Equations (11) and (12) respectively. We say that $\Omega$ is 
“consistent” if the property of Equation (14) holds. Let $\#\Omega$ denote the number of 
indices i, j such that $(\delta_i)_j \neq 0$. 

We thus have the following result. 

Lemma 6. For any non-zero differential characteristic $\Omega$ we have 

$\mathrm{EDP}(\Omega) \le (\mathrm{DP}^P_{\max})^{\#\Omega}$

where $\mathrm{DP}^P_{\max} = 2^{-4}$, and EDP($\Omega$) = 0 if $\Omega$ is not consistent. 

Proof. We start from Equation (13). If $\Omega$ is not consistent, we obviously have 
EDP($\Omega$) = 0 since P is a permutation. For non-active P-boxes, the probability 
is obviously 1. For active P-boxes (there are $\#\Omega$ of them), we upper bound the 
probability by 

$\mathrm{DP}^P_{\max} = \max_{a \neq 0,\, b} \mathrm{DP}^P(a, b)$

which is equal to $2^{-4}$ for CS-Cipher by construction. □ 

We thus need to lower bound $\#\Omega$ for consistent differential characteristics. This 
paradigm is already well known in the literature: in order to protect against 
heuristic differential attacks, we need to make sure that all consistent characteristics 
have a large number of active nonlinear boxes. Actually, the original papers of 
Biham and Shamir focus on looking for differential characteristics with a minimal 
number of S'-boxes (see jZ]). 

Thanks to the multipermutation property of $\mu$, it is fairly easy to investigate 
the minimal number of active P-boxes in CS-Cipher. Actually, $\mu$ has the property 
that 

1. $\mu$ is a permutation, 

2. for all a, both outputs of $\mu(a, y)$ are permutations of y, 

3. for all b, both outputs of $\mu(x, b)$ are permutations of x. 

Thus, if exactly one input of $\mu$ is non-zero, then both outputs of $\mu$ are non-zero. 
If the two inputs of $\mu$ are non-zero, then at least one output of $\mu$ is non-zero. 
In other terms, the difference patterns around one $\mu$ box can only be one out of 
the six following patterns. 

00 → 00    0* → **    *0 → **    ** → 0*    ** → *0    ** → ** 

(Stars mean any non-zero differences.) With the notations of the previous section, 
we recall that 

$\delta_i = (\mu^4 \circ L_\pi)(\delta'_{i-1}).$

Moreover we consider consistent characteristics, which means that $(\delta'_i)_j$ is 
non-zero if and only if $(\delta_i)_j$ is non-zero. This enables us to make rules for the 
“non-zeroness” of the $\delta_i$'s. 

Actually, we consider 8-bit vectors $I_i = \mathrm{Supp}(\delta_i)$. From the previous 
arguments we can make a list of possible $I_i \to I_{i+1}$ transitions. (In total we have 
$6^4 = 1296$ rules.) To each possible $I_i$ we associate its Hamming weight $\#I_i$. We 
can now make the graph of all possible $I_i$'s weighted by $\#I_i$ and in which each 
edge corresponds to a rule. Since $\#I_i$ is also equal to the number of non-zero 
entries in $\omega_i$, finding out the minimal number of active P-boxes in a consistent 
differential characteristic corresponds to finding a path of length 24 edges with 
minimal non-zero weight in this graph, which is fairly easy, for instance by using 
the Floyd- Warshall algorithm HH (see [H3, pp. 558-565]). Its complexity is es- 
sentially cubic in time and quadratic in memory (in term of number of vertices, 
which is 256 here) . Experiment shows that such a path has at least a total weight 
of 72. More precisely, the shortest non-zero weight for paths of given edge-length 
is given by the table below. 



length   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
weight   1  3  5  9 13 18 20 24 26 30 32 36 38 42 44 48 50 54 56 60 62 66 68 72 
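
The search over support patterns described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a vertex is the support of the input to a $P^8$ layer, an edge applies the position permutation $L_\pi$ followed by the six-pattern rule of each $\mu$ box, and the minimum is found by a plain dynamic program over the path length rather than by the Floyd-Warshall algorithm named in the text. The names `MU_RULES`, `PERM`, `successors` and `min_weights` are hypothetical.

```python
from itertools import product

# Support patterns around one mu box: (in1, in2) -> allowed (out1, out2).
# 1 means "non-zero difference", 0 means "zero difference".
MU_RULES = {
    (0, 0): [(0, 0)],
    (0, 1): [(1, 1)],
    (1, 0): [(1, 1)],
    (1, 1): [(0, 1), (1, 0), (1, 1)],
}

# L_pi in 0-based indexing: output position k takes input position PERM[k].
PERM = (0, 2, 4, 6, 1, 3, 5, 7)
ZERO = (0,) * 8


def successors(support):
    """Supports reachable after L_pi followed by four parallel mu boxes."""
    y = tuple(support[p] for p in PERM)
    choices = [MU_RULES[(y[2 * k], y[2 * k + 1])] for k in range(4)]
    return [tuple(b for pair in combo for b in pair)
            for combo in product(*choices)]


VERTICES = list(product((0, 1), repeat=8))
EDGES = {v: successors(v) for v in VERTICES}


def min_weights(max_len):
    """Entry l-1 is the least number of active P-boxes over l layers."""
    best = {v: sum(v) for v in VERTICES if v != ZERO}
    out = [min(best.values())]
    for _ in range(max_len - 1):
        nxt = {}
        for v, w in best.items():
            for u in EDGES[v]:
                cand = w + sum(u)
                if cand < nxt.get(u, cand + 1):
                    nxt[u] = cand
        best = nxt
        out.append(min(best.values()))
    return out


if __name__ == "__main__":
    print(sum(len(e) for e in EDGES.values()))  # number of transition rules
    print(min_weights(24))                      # compare with the table
```

The total number of edges equals the number of rules quoted in the text, and the list returned by `min_weights(24)` can be compared entry by entry with the table above.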



Thus, since each of the at least 72 active P-boxes contributes a factor of at 
most $2^{-4}$, we obtain that for any differential characteristic $\Omega$ we have 

$\mathrm{EDP}(\Omega) \le 2^{-288}.$

This makes CSC* provably resistant against the original differential 
cryptanalysis. Actually, six rounds of CSC* lead to an upper bound of $2^{-72}$, which is 
already enough. This corresponds to two rounds of CS-Cipher instead of eight, 
so this suggests that four rounds of CS-Cipher are already secure against 2R 
differential attacks. 



3.3 Linear Cryptanalysis 

Linear cryptanalysis has a very similar treatment as was shown by Biham jS|. 
Actually, we have 

$\omega_{i-1} \cdot M_{i-1} = \left({}^t(\mu^4)^{-1}(\omega_{i-1}) \cdot (\mu^4 \circ s_{i-1})(M_{i-1})\right) \oplus (\omega_{i-1} \cdot k_{i-1})$

and 

$\omega_i \cdot M_i = {}^tL_\pi(\omega_i) \cdot (L_\pi)^{-1}(M_i)$

thus we let 

$\delta_i = {}^t(\mu^4)^{-1}(\omega_{i-1})$   (15) 

and 

$\delta'_i = {}^tL_\pi(\omega_i).$   (16) 

As for differential cryptanalysis, we use the following definition. 

Definition 7. For any linear characteristic $\Omega = (\omega_0, \ldots, \omega_{24})$, we define the 
$\delta_i$'s and $\delta'_i$'s by Equations (15) and (16) respectively. We say that $\Omega$ is 
“consistent” if the property of Equation (14) holds. Let $\#\Omega$ denote the number of 
indices i, j such that $(\delta_i)_j \neq 0$. 

We thus have the following result. 

Lemma 8. For any non-zero linear characteristic $\Omega$ we have 

$\mathrm{ELP}(\Omega) \le (\mathrm{LP}^P_{\max})^{\#\Omega}$

where $\mathrm{LP}^P_{\max} = 2^{-4}$, and ELP($\Omega$) = 0 if $\Omega$ is not consistent. 

Here the relation between the $\delta_i$'s and the $\delta'_i$'s is 

$\delta'_{i-1} = {}^t(\mu^4 \circ L_\pi)(\delta_i)$

which is equivalent to 

$\delta_i = {}^t\left((\mu^4 \circ L_\pi)^{-1}\right)(\delta'_{i-1})$

instead of 

$\delta_i = (\mu^4 \circ L_\pi)(\delta'_{i-1})$

as for differential cryptanalysis. Obviously, ${}^t(\mu^{-1})$ has the same multipermutation 
property as $\mu$, thus the “non-zeroness” rules for the $\delta_i$'s and $\delta'_i$'s are the same. 
We thus obtain that 

$\mathrm{ELP}(\Omega) \le 2^{-288}.$

This makes CSC* heuristically resistant against the original linear cryptanalysis. 

3.4 Support Characteristics 

Here we aim to upper bound the probabilities of the support characteristics for 
CSC*. One problem is that the propagation of non-zero differences through the 
P-boxes has no unusual cases. For this reason we concentrate on unusual 
propagations through the $\mu$-boxes. 

Here the characteristic $\Omega = (I_0, \ldots, I_{24})$ defines exactly which inputs and 
outputs of the $\mu$-boxes are non-zero. The probability of the characteristic is 
non-zero only if the number of non-zero inputs and outputs of any $\mu$-box is in 
{0, 3, 4}. With these rules we can make the graph of all possible $I_i \to I_{i+1}$ 
transitions. The problem is to weight it. 
The more interesting probabilities correspond to the case where two inputs 
of a $\mu$-box are non-zero, and one output is zero. Let us denote by $\mu_1$ and $\mu_2$ the 
two outputs. We thus consider the two probabilities 

$\Pr[\mu_j(X \oplus X', Y \oplus Y') = 0 \,/\, X \neq X', Y \neq Y']$

for j = 1 and j = 2. Due to the linearity of $\mu$, this is equal to 

$\Pr[\mu_j(X, Y) = 0 \,/\, X \neq 0, Y \neq 0]$

which is $\frac{1}{255} \approx 2^{-8}$ from the multipermutation properties of $\mu$: for each 
non-zero X, exactly one Y gives $\mu_j(X, Y) = 0$, and this Y is non-zero. 
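
The properties of $\mu$ used here are small enough to check exhaustively. The following Python sketch assumes the definitions of $R_l$, $\varphi$ and $\mu$ given in Section 3.1, with bytes represented as integers and the rotation acting on that integer representation (an assumption about bit ordering); the function names are hypothetical. It checks the three multipermutation properties and counts the input pairs with two non-zero inputs and one zero output.

```python
def rol1(x):
    """Circular left rotation of an 8-bit value by one position (R_l)."""
    return ((x << 1) | (x >> 7)) & 0xFF


def phi(x):
    """phi(x) = (R_l(x) AND 0x55) XOR x."""
    return (rol1(x) & 0x55) ^ x


def mu(a, b):
    """mu(a, b) = (phi(a) XOR b, R_l(a) XOR b)."""
    return (phi(a) ^ b, rol1(a) ^ b)


def is_perm(values):
    """True if the 256 values are exactly 0..255 in some order."""
    return sorted(values) == list(range(256))


def is_multipermutation():
    """Check properties 1 to 3 of mu by exhaustive search."""
    images = {mu(a, b) for a in range(256) for b in range(256)}
    if len(images) != 256 * 256:          # property 1: mu is a permutation
        return False
    for j in (0, 1):
        for c in range(256):
            if not is_perm([mu(c, y)[j] for y in range(256)]):  # property 2
                return False
            if not is_perm([mu(x, c)[j] for x in range(256)]):  # property 3
                return False
    return True


def zero_output_count(j):
    """Number of (a, b), both non-zero, for which output j of mu is zero."""
    return sum(1 for a in range(1, 256) for b in range(1, 256)
               if mu(a, b)[j] == 0)


if __name__ == "__main__":
    print(is_multipermutation())
    print(zero_output_count(0), zero_output_count(1), "out of", 255 * 255)
```

Each count divided by $255^2$ is the probability of a 2-1 transition on the corresponding output.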

One problem is the vertex $I = \{1, \ldots, 8\}$, because all transitions towards it 
have weight 0. Intuitively, if we go through this vertex, we lose all information, 
and the final probability is actually meaningless, because it is smaller than the 
probability of the same external characteristic for a truly random cipher. For 
instance, if we have the support characteristic $\Omega = (I_0, \ldots, I_{24})$ in which 
$I_i = \{1, \ldots, 8\}$, we obtain 

$\Pr[\Omega] \le \Pr[\mathrm{Supp}(X) = I_{24}]$

so the “signal” of the support characteristic will vanish against the “noise” of 
natural behavior. Thus we remove this vertex from the graph. 

We can now weight the edges of the graph by the total number of 2-1 
transitions in the four $\mu$-boxes: each $I_i \to I_{i+1}$ edge defines four transitions 
through a $\mu$-box, so we can count the ones with no zero input and one zero output. 
Then we can look for the path with length 24 and minimal weight. The experiment 
shows that the minimal weight is 22. Actually, for any length $\ell \ge 2$, the minimal 
weight is $\ell - 2$. It is obtained, for instance, by iterating the path 

{1,3} → {1,2,5,6} → {1,3} 

which has weight 2 (thus probability $2^{-16}$). For instance, the path 

{1,3} → {1,2,5,6} → {1,3} → ... → {1,2,5,6} → {1,3,6,8} 

of even length $\ell$ has weight $\ell - 2$. 

Thus the probability of a support characteristic on 24 rounds is less than 
$(2^{-8})^{22} = 2^{-176}$. Actually ten rounds of CSC* lead to an upper bound of 
$2^{-64}$. This corresponds to 3.33 rounds of CS-Cipher. So we believe that these 
properties make 5.33 rounds of CS-Cipher heuristically resistant against any 
multi-path differential characteristics. Eight rounds is therefore a comfortable 
safety margin. 

One open problem is the resistance against the recent impossible differen- 
tial cryptanalysis. In this paper we investigated differential characteristics with 
overwhelming behavior. The question now is how to address characteristics with 
unexpected low probabilities. 
4 Conclusion 

We have shown that CSC* admits no differential or linear characteristic with 
average probability greater than $2^{-288}$ and no support characteristic with an 
average probability greater than $2^{-176}$. We believe that these results hold for 
CS-Cipher as well, which makes it heuristically secure against differential, 
linear, truncated, and other related differential cryptanalysis. The question of 
impossible differentials remains open though, as well as more general 
attacks. 

Whereas ciphers similar to CS-Cipher use linear diffusion layers for mixing 
all pieces of a message in each round (for instance the four AES candidates 
Safer+, Serpent, Rijndael, Crypton), CS-Cipher uses a nonlinear diffusion 
primitive: the $\mu$ operation which is mixed with two nonlinear P-boxes. This makes 
it possible to achieve a stronger design at a minimal cost (both $\mu$ and P have 
quite efficient implementations). It also illustrates that we can use general 
multipermutations and not only MDS codes: large linear layers are nice presents 
for the attacker. 
Abstract. This paper presents an efficient interpolation attack using a 
computer algebra system. The interpolation attack proposed by Jakobsen 
and Knudsen was shown to be effective for attacking ciphers that 
use simple algebraic functions. However, there was a problem that the 
complexity and the number of pairs of plaintexts and ciphertexts required 
for the attack can be overestimated. We solve this problem by first 
finding the actual number of coefficients in the polynomial (or rational 
expression) used in the attack by using a computer algebra system, and 
second, by finding the polynomial (or rational expression) with the fewest 
coefficients by choosing the plaintexts. We apply this interpolation attack 
to the block cipher SNAKE proposed by Lee and Cha at JW-ISC’97. In 
the SNAKE family there are two types of Feistel ciphers, SNAKE(1) and 
SNAKE(2), with different round functions. Both of them use the inverse 
function in the Galois field $GF(2^m)$ as S-box. We show that when the block 
size is 64 bits and m = 8, all round keys are recovered for SNAKE(1) 
and SNAKE(2) with up to 11 rounds. Moreover, when the block size is 
128 bits and m = 16, all round keys are recovered for SNAKE(1) with 
up to 15 rounds and SNAKE(2) with up to 16 rounds. 

1 Introduction 

Since two powerful cryptanalyses on block ciphers, differential cryptanalysis Q 
and linear cryptanalysis^, were presented, some new block ciphers with pro- 
vable security against these cryptanalyses have been proposed. On the other 
hand, Jakobsen and Knudsen raised the alarm that some of them are easy to 
cryptanalyze by algebraic attacks such as higher order differential attack and 
interpolation attack^. These attacks are effective for attacking ciphers that use 
simple algebraic functions. For example, there is the 6-round prototype Feistel 
cipher presented in |S|, called the cipher KN. It uses the cubing function in 

* Most of this work was done while the authors were with TAO, the 
Telecommunications Advancement Organization of Japan. 
$GF(2^{33})$ in the round function, and was broken by the higher order differential 
attack exploiting the low degree of polynomial expression over GF(2). A variant 
of the cipher, called the cipher PURE, which uses the cubing function in $GF(2^{32})$ 
as the round function, was broken by the interpolation attack up to 32 rounds 
by exploiting the low degree of polynomial expression over $GF(2^{32})$. Moreover, 
a slightly modified version of the cipher SHARK |S|, which uses the inverse 
function in $GF(2^8)$ as S-box, was broken up to 5 rounds by an interpolation attack 
exploiting the low degree of rational expression over $GF(2^8)$. 

The principle of the interpolation attack is that, roughly speaking, if the 
ciphertext is represented as a polynomial or rational expression of the plaintext 
with N coefficients, the polynomial or rational expression can be constructed 
using N pairs of plaintexts and ciphertexts. Since N determines the complexity 
and the number of pairs required for the attack, it is important to find as small 
N as possible. 

This paper shows two solutions to find a tighter upper bound of N . The first 
problem is that generally it is difficult to find the actual number of coefficients 
in the polynomial or rational expression. In |Sj Jakobsen and Knudsen estimated 
it from the degree of the polynomial or rational expression. However, this me- 
thod often overestimates it when we use a multivariate polynomial or rational 
expression, in particular. As the solution to this problem, we compute the actual 
polynomial or rational expression by using a computer algebra system and find 
the number of coefficients. The second problem is the number of coefficients (or 
the degree) of the polynomial or rational expression varies with the plaintexts 
chosen. If we use a computer algebra system, it is easy to compute the number 
of coefficients of the polynomial or rational expression in a few variables. We 
can find the polynomial or rational expression with the fewest coefficients by 
choosing the plaintexts. 

We apply this interpolation attack to the block cipher SNAKE proposed by 
Lee and Cha at JW-ISC’97 [3]. This cipher is not a prototype cipher and we 
do not modify it to simplify the cryptanalysis. The cipher SNAKE is a Feistel 
cipher with provable resistance against differential and linear cryptanalysis. In 
[3], it is also claimed that SNAKE is resistant against higher order differential 
attack and interpolation attack, though the rationale was not discussed enough. 
SNAKE(1) and SNAKE(2) have different round functions. To put it concretely, 
the structure of the round function is the same, the substitution-permutation 
network (SPN). Both of them use the same number of S-boxes in the round 
function, and the function used as the S-box is the same, i.e., the inverse function 
in $GF(2^m)$; only the diffusion layer is different. 

We apply to the cipher SNAKE the interpolation attack using rational 
expressions. If we represent the cipher SNAKE as a polynomial, the attack becomes 
impractical with only a few rounds, since the number of coefficients in the 
polynomial increases to the upper bound of the number of pairs we can obtain. 
This is because the degree of the inverse function in $GF(2^m)$ in polynomial 
expression is very high as follows: $f(x) = x^{-1} = x^{2^m - 2}$. Using a computer algebra 
system, we find the rational expression with the fewest coefficients by choosing 
the plaintexts. As a result, both of the SNAKE ciphers with many rounds are 
broken. When the block size is 64 bits and m = 8, all round keys are recovered 
for SNAKE(1) and SNAKE(2) with up to 11 rounds. Moreover, when the block 
size is 128 bits and m = 16, all round keys are recovered for SNAKE(1) with up 
to 15 rounds and SNAKE(2) with up to 16 rounds. 

This paper is organized as follows. In Section 2, we give a summary of the 
interpolation attack. Section 3 describes the specifications of the cipher SNAKE. 
In Sections 4 and 5, we apply the interpolation attack to the cipher SNAKE 
with blocksize 64 bits and 128 bits, respectively. In Section 6, we discuss some 
problems and make concluding remarks. 

2 The Interpolation Attack 

In this section, we describe the outline of the interpolation attack proposed by 
Jakobsen and Knudsen in [2] and explain the notations used in this paper. The 
target of the attack is an iterated cipher with block size 2n bits and R rounds. 
We denote a plaintext by x and a ciphertext by y. Let x be the concatenation 
of u subblocks $x_i \in GF(2^m)$, where $2n = m \times u$. We define y similarly. 

$x = (x_u, x_{u-1}, \ldots, x_1) \in GF(2^m)^u, \quad x_i \in GF(2^m)$

$y = (y_u, y_{u-1}, \ldots, y_1) \in GF(2^m)^u, \quad y_i \in GF(2^m)$

Moreover, we denote the i-th round key by $k^{(i)}$ and let the length of $k^{(i)}$ be l 
bits. Similarly let $k^{(i)}$ be the concatenation of t subblocks $k_j^{(i)} \in GF(2^m)$, 
where $l = m \times t$. 

$k^{(i)} = (k_t^{(i)}, \ldots, k_1^{(i)}) \in GF(2^m)^t, \quad k_j^{(i)} \in GF(2^m)$

2.1 Global Deduction 

If the key is fixed to k, a ciphertext subblock $y_j \in GF(2^m)$ can be expressed as 
a polynomial in plaintext subblocks $\{x_u, x_{u-1}, \ldots, x_1\}$ as follows: 

$y_j = f_{jk}(x_u, x_{u-1}, \ldots, x_1) \in GF(2^m)[x_u, x_{u-1}, \ldots, x_1],$

where $GF(2^m)[X]$ is the polynomial ring of $X = \{x_u, \ldots, x_1\}$ over $GF(2^m)$. If the 
number of coefficients in $f_{jk}(x_u, x_{u-1}, \ldots, x_1)$ is N, the attacker can construct 
$f_{jk}(x_u, x_{u-1}, \ldots, x_1)$ from N different pairs of plaintexts and ciphertexts. If we 
define $\deg_{x_i} f_{jk}$ as the degree of $f_{jk}(x_u, x_{u-1}, \ldots, x_1)$ with respect to $x_i$, N is 
estimated as follows. 

$N \le \prod_{1 \le i \le u} (\deg_{x_i} f_{jk} + 1)$   (1) 

Note that N can be overestimated when u is large and the polynomial is sparse. 

Once the attacker constructs $f_{jk}(x_u, x_{u-1}, \ldots, x_1)$, (s)he can encrypt any 
plaintext into the corresponding ciphertext for key k, without knowing the key. 
This attack is called global deduction by Knudsen [4]. Similarly, by swapping 
ciphertexts and plaintexts, once the attacker can construct the polynomial in 
$GF(2^m)[y_u, y_{u-1}, \ldots, y_1]$ that expresses $x_i$ in terms of $(y_u, y_{u-1}, \ldots, y_1)$, (s)he 
can decrypt any ciphertext into the corresponding plaintext for key k, without 
knowing the key. 
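
To see how far the degree-based estimate of N can be from the true number of coefficients, here is a small Python sketch. The polynomial `f` is a made-up sparse example (not taken from any cipher), stored as a dictionary from exponent tuples to non-zero coefficients; `coefficient_bound` and `degrees_of` are hypothetical helper names.

```python
from math import prod


def coefficient_bound(degrees):
    """Degree-based upper bound on N: product of (deg_{x_i} f + 1)."""
    return prod(d + 1 for d in degrees)


def degrees_of(poly, nvars):
    """Per-variable degrees of a polynomial {exponent_tuple: coefficient}."""
    return [max(exps[i] for exps in poly) for i in range(nvars)]


# Hypothetical sparse polynomial in (x3, x2, x1): x3*x2*x1 + x3^2 + 1.
f = {(1, 1, 1): 1, (2, 0, 0): 1, (0, 0, 0): 1}

bound = coefficient_bound(degrees_of(f, 3))  # (2+1) * (1+1) * (1+1)
actual = len(f)                              # number of non-zero coefficients

if __name__ == "__main__":
    print("degree-based bound:", bound, "actual coefficients:", actual)
```

The gap between `bound` and `actual` is what motivates computing the actual expression with a computer algebra system instead of relying on degrees alone.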
2.2 Instance Deduction 

If some subblocks of plaintexts are fixed to some values, e.g., $x = (0, \ldots, 0, x_1)$, 
a ciphertext subblock $y_j \in GF(2^m)$ can be expressed as a polynomial as follows: 

$y_j = f_{jk}(x_1) \in GF(2^m)[x_1].$

In this case, $f_{jk}(x_1)$ is a polynomial in one variable $x_1$. Generally, there are fewer 
coefficients in $f_{jk}(x_1)$ than in global deduction. Therefore, the attacker can 
construct $f_{jk}(x_1)$ from fewer chosen plaintexts and ciphertexts. Let N be the number 
of coefficients and let $\deg_{x_1} f_{jk} = d$; then N is estimated as $N \le d + 1$. Once the 
attacker can construct $f_{jk}(x_1)$ from N pairs of plaintexts and ciphertexts, (s)he 
can encrypt a subset of all plaintexts, e.g., $x = (0, \ldots, 0, x_1)$, $\forall x_1 \in GF(2^m)$, into 
the corresponding ciphertexts for key k, without knowing the key. This attack 
is called instance deduction by Knudsen [4]. Similarly, by swapping ciphertexts 
and plaintexts, the attack where a subset of all ciphertexts are decrypted into 
the corresponding plaintexts is possible. 
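
Instance deduction in one variable amounts to polynomial interpolation over $GF(2^m)$. The Python sketch below illustrates it for m = 8 on a toy keyed map of degree 3, so that d + 1 = 4 chosen plaintexts suffice. Everything specific here is an assumption made for illustration: the reduction polynomial `0x11B`, the map `toy_cipher`, the key value and the helper names. None of it is the SNAKE cipher.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo an irreducible polynomial (assumed 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    """a^(2^8 - 2): the inverse for a != 0, and 0 for a = 0."""
    return gf_pow(a, 254)


def lagrange_eval(points, x0):
    """Value at x0 of the polynomial of degree < len(points) through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = gf_mul(num, x0 ^ xj)   # subtraction is XOR
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total


def toy_cipher(x, key):
    """Hypothetical keyed map y = (x + key)^3, of degree d = 3 in x."""
    return gf_pow(x ^ key, 3)


SECRET = 0x5A
pairs = [(x, toy_cipher(x, SECRET)) for x in range(4)]  # d + 1 chosen plaintexts
recovered = all(lagrange_eval(pairs, x) == toy_cipher(x, SECRET)
                for x in range(256))

if __name__ == "__main__":
    print("all 256 plaintexts encrypted without the key:", recovered)
```

The attacker never uses `SECRET` after collecting the four pairs; `lagrange_eval` alone reproduces every ciphertext of the toy map.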



2.3 Key Recovery 

The attacker recovers the last round key as follows. We denote the output of the 
(R − 1)-th round by $\tilde{y} = (\tilde{y}_u, \tilde{y}_{u-1}, \ldots, \tilde{y}_1) \in GF(2^m)^u$. A subblock 
$\tilde{y}_j \in GF(2^m)$ can be expressed as a polynomial in $\{x_u, x_{u-1}, \ldots, x_1\}$ as follows: 

$\tilde{y}_j = f(x_u, x_{u-1}, \ldots, x_1) \in GF(2^m)[x_u, x_{u-1}, \ldots, x_1].$

Let N' be the number of coefficients in $f(x_u, x_{u-1}, \ldots, x_1)$. On the other hand, 
$\tilde{y}_j$ can be also expressed using the ciphertext y and the last round key $k^{(R)}$. 
Therefore, if N' pairs of plaintexts x and ciphertexts y are available, the attacker 
can construct $f(x_u, x_{u-1}, \ldots, x_1)$ using $\tilde{y}_j$ which is computed using y and a 
guessed $k^{(R)}$. 

$f(x_u, x_{u-1}, \ldots, x_1) = \tilde{y}_j(y, k^{(R)})$   (2) 

If Eq. (2) holds for another plaintext/ciphertext pair, the guessed $k^{(R)}$ is correct 
with high probability. From the procedure above, the last round key is recovered 
from any N' + 1 pairs of plaintexts and ciphertexts. The average of the required 
complexity for recovering the last round key is $(N' + 1)\,2^{l'-1}$ where l' is the 
number of last round key bits effective in Eq. (2). Repeating similar procedures, 
the attacker can find all round keys. 

In the above, we showed only how to recover the last round key in the case 
of the global deduction attack, or known plaintext attack. Similarly the instance 
deduction attack, or chosen plaintext attack, is also possible. In the instance 
deduction attack, the attacker can only use chosen plaintexts and ciphertexts, 
but fewer pairs are required than in the global deduction attack. 
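
The guess-and-check procedure can be illustrated on a toy two-round map over $GF(2^8)$. This Python sketch is not the attack on SNAKE: the map `encrypt`, the reduction polynomial `0x11B`, the key values and all helper names are assumptions chosen so that the inner value has degree 7 in the plaintext (N' = 8). It also uses twelve verification pairs rather than the single extra pair mentioned in the text, so that the toy result is unambiguous.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo an irreducible polynomial (assumed 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    return gf_pow(a, 254)  # inverse for a != 0, maps 0 to 0


def lagrange_eval(points, x0):
    """Value at x0 of the polynomial of degree < len(points) through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = gf_mul(num, x0 ^ xj)
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total


def encrypt(x, k1, k2):
    """Hypothetical toy: inner = (x + k1)^7, ciphertext = inner^(-1) + k2."""
    return gf_inv(gf_pow(x ^ k1, 7)) ^ k2


def recover_last_key(pairs, n_coef):
    """Guess k2, interpolate the inner value from n_coef pairs, test the rest."""
    fit, check = pairs[:n_coef], pairs[n_coef:]
    candidates = []
    for guess in range(256):
        # Inner value implied by this guess: peel off the last round.
        inner = [(x, gf_inv(y ^ guess)) for x, y in fit]
        if all(lagrange_eval(inner, x) == gf_inv(y ^ guess) for x, y in check):
            candidates.append(guess)
    return candidates


K1, K2 = 0x3C, 0xA7
pairs = [(x, encrypt(x, K1, K2)) for x in range(20)]
candidates = recover_last_key(pairs, n_coef=8)

if __name__ == "__main__":
    print([hex(c) for c in candidates])
```

Only the correct guess makes the peeled-off values consistent with a polynomial of degree at most 7 on all twenty pairs; a wrong guess turns the inner value into a rational function of the plaintext that no such polynomial can match on that many points.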
2.4 Meet-in-the-Middle Approach 

The meet-in-the-middle approach in the interpolation attack was introduced by 
Jakobsen and Knudsen [2], and is effective for some attacks on block ciphers. 

We denote the output of a certain internal round by $z = (z_u, z_{u-1}, \ldots, z_1) \in 
GF(2^m)^u$. A subblock of z, $z_j \in GF(2^m)$, can be expressed as a polynomial in 
$\{x_u, x_{u-1}, \ldots, x_1\}$ as follows: 

$z_j = f(x_u, x_{u-1}, \ldots, x_1) \in GF(2^m)[x_u, x_{u-1}, \ldots, x_1].$

On the other hand, $z_j$ can be also expressed as a polynomial in 
$\{\tilde{y}_u, \tilde{y}_{u-1}, \ldots, \tilde{y}_1\}$ as follows: 

$z_j = g(\tilde{y}_u, \tilde{y}_{u-1}, \ldots, \tilde{y}_1) \in GF(2^m)[\tilde{y}_u, \tilde{y}_{u-1}, \ldots, \tilde{y}_1].$

Note that $\tilde{y}_j$ can be computed from the ciphertext y and a guessed $k^{(R)}$. 
Therefore, Eq. (3) is constructed by guessing $k^{(R)}$. 

$f(x_u, x_{u-1}, \ldots, x_1) = g(\tilde{y}_u, \tilde{y}_{u-1}, \ldots, \tilde{y}_1)$   (3) 

The number of pairs required for constructing Eq. (3) is computed as follows. If 
f and g are represented as polynomials, the required number of pairs is 

(# of coefs. in f) + (# of coefs. in g). 

If f and g are represented as rational expressions $f = f_1/f_2$ and $g = g_1/g_2$, where 
$f_2 \neq 0$ and $g_2 \neq 0$, it is 

((# of coefs. in $f_1$) − 1) × (# of coefs. in $g_2$) 
+ (# of coefs. in $f_2$) × ((# of coefs. in $g_1$) − 1). 

Note that we subtract 1's because we can fix one of the coefficients in the rational 
expression to a certain value, e.g., 1. The attacker can judge whether the guessed 
$k^{(R)}$ is correct or not by examining if Eq. (3) holds for another 
plaintext/ciphertext pair. 

3 SNAKE 

The cipher SNAKE is a Feistel cipher, which has two types, SNAKE(l) and 
SNAKE(2), with different round functions. The general form is SNAKE(i)(m, s, 
w, r), where 

i - (= 1 or 2) the type of SNAKE described below, 
m - the size of input and output of the S-box in bits, 
s - the number of S-boxes used in the round function, 
w - block size in bits (w = 2sm), 
r - the number of rounds. 
Fig. 1. Round function of SNAKE(1) 

Fig. 2. Round function of SNAKE(2) 



Figures 1 and 2 show the round functions of SNAKE(1) and SNAKE(2) for 
s = 4, which were demonstrated in [3]. For the S-box in the round function, the 
inverse function $S(x) = x^{-1}$ in $GF(2^m)$ is used, because the differential probability 
and linear probability of S(x) are $2^{2-m}$ when m is even. In this paper we define 
the S-box as follows, though the output for input 0 is not defined in [3]. 

$S(x) = x^{-1}$ in $GF(2^m)$ if $x \neq 0$, and $S(x) = 0$ if $x = 0$. 

Since SNAKE(i)(8, 4, 64, 16) is given as an example in [3], we apply the 
interpolation attack to it in Section 4. Moreover, we also apply the interpolation attack 
to a 128-bit variant, SNAKE(i)(16, 4, 128, 16), in Section 5, since in [3] it is 
claimed that one of the merits of the cipher SNAKE is that its encrypting data block 
length (= block size) is flexible. 
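
As a concrete illustration of why the polynomial form of the S-box has such a high degree, the following Python sketch computes the inverse in $GF(2^8)$ as the power $x^{2^8 - 2} = x^{254}$. The reduction polynomial `0x11B` is an assumption made for the example (the field polynomial used by SNAKE is not given in this section), as is the convention that input 0 maps to 0; the function names are hypothetical.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo an irreducible polynomial (assumed 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def sbox(x):
    """S(x) = x^(-1) for x != 0, written as the monomial x^254; S(0) = 0."""
    return gf_pow(x, 254)


# x * S(x) = 1 for every non-zero x, so the degree-254 monomial is the inverse.
inverse_ok = all(gf_mul(x, sbox(x)) == 1 for x in range(1, 256))

if __name__ == "__main__":
    print("x * S(x) == 1 for all non-zero x:", inverse_ok)
```

As a polynomial the S-box has degree 254, whereas as a rational expression it is simply 1/x, which is why the attack works with rational expressions.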

4 Interpolation Attack of SNAKE(i)(8, 4, 64, r) 

4.1 Rational Expression over $GF(2^8)$ of SNAKE(i)(8, 4, 64, r) 

In this section, we attack SNAKE(i)(8, 4, 64, r) using the interpolation attack. 
If we represent the cipher SNAKE as a polynomial, the attack becomes 
impractical with only a few rounds, since the number of coefficients in the polynomial 
increases to the upper bound of the number of pairs we can obtain. Therefore, 
we represent SNAKE as a rational expression over $GF(2^8)$. 

Let the plaintext block and the ciphertext block be as follows. 

$x = (x_8, x_7, \ldots, x_1) \in GF(2^8)^8, \quad x_i \in GF(2^8)$

$y = (y_8, y_7, \ldots, y_1) \in GF(2^8)^8, \quad y_i \in GF(2^8)$

We denote the i-th round key by 

$k^{(i)} = (k_4^{(i)}, \ldots, k_1^{(i)}) \in GF(2^8)^4, \quad k_j^{(i)} \in GF(2^8).$

Moreover, we denote the upper half (32 bits) of the output of the r-th round by 

$z^{(r)} = (z_4^{(r)}, \ldots, z_1^{(r)}) \in GF(2^8)^4, \quad z_j^{(r)} \in GF(2^8).$



Global deduction. In the global deduction, we represent a ciphertext subblock 
as a rational expression over $GF(2^8)$ in $\{x_8, x_7, \ldots, x_1\}$. First of all, we show that 
the round functions of SNAKE(1) and SNAKE(2) can be represented as simple 
rational expressions as follows: 



SNAKE(l) 



Eg = 



E4 = 



El = 









Xa 



^3+ 1 



SNAKE(2) 



E = 



El 



Eg = 



E4 = 



El = 



E1 + X2 
1 



X1 + X2 + E3 
1 

El + X, + E, + E4 



where variables $X_i, Y_j \in GF(2^8)$ are shown in Figures 1 and 2. 

Next, we extend the expressions of the round function to the entire cipher. In 
a Feistel cipher, there are XORs between 32-bit data in each round. These 
operations are regarded as four additions on $GF(2^8)$. We would like to find the 
rational expression over $GF(2^8)$ of each subblock of the output of the r-th round, 
$z_j^{(r)} = f_{j1}^{(r)} / f_{j2}^{(r)}$, where $f_{j1}^{(r)}, f_{j2}^{(r)} (\neq 0) \in 
GF(2^8)[x_8, \ldots, x_1, k_4^{(1)}, \ldots, k_1^{(1)}, \ldots, k_4^{(r)}, \ldots, k_1^{(r)}]$. 

We use the computer algebra system Risa/Asir|Zj to compute the rational 
expressions. It usually takes much time and space complexity to find them since 
the number of variables and the degree increases as the number of rounds in- 
creases. However, it is possible to find the rational expressions of the cipher 
SNAKE with only a few rounds. We show the actual numbers of coefficients in 
the rational expressions in Table 1. The number of coefficients we find here is 
very important and useful for evaluating the tighter upper bound of the 
complexity and the number of p/c pairs in the key-recovery attack that uses the 
meet-in-the-middle approach (see Section 2.4). 

For estimating the number of coefficients in the rational expressions of SNAKE 
with more rounds, we use the following two techniques. 

— decrease the number of variables by representing each round key $k_j^{(i)}$ as 
$\kappa^{\iota} \in GF(2^8)$, i.e., a monomial in $\kappa$, where $\kappa \in GF(2^8)$ and $\iota$ is randomly 
chosen from $GF(2^8) \setminus \{0, 255\}$. 

— estimate the upper bound of the number of coefficients using Eq. (1), since 
it is easy to find the degree of the rational expression with respect to each 
variable. 
Table 1. # of coef. in rational exp. over $GF(2^8)$ of SNAKE(1) and SNAKE(2) (global 
deduction) 

subblocks                                       SNAKE(1)                                          SNAKE(2) 
$z_4^{(1)}, z_3^{(1)}, z_2^{(1)}, z_1^{(1)}$    (12)/(8), (6)/(4), (3)/(2), (24)/(16)             (5)/(4), (4)/(3), (3)/(2), (6)/(5) 
$z_4^{(2)}, z_3^{(2)}, z_2^{(2)}, z_1^{(2)}$    (432)/(352), (108)/(88), (40)/(32), (1760)/(1540)  (85)/(72), (34)/(27), (14)/(10), (my/(185) 

Note) (# of coefs. in the numerator)/(# of coefs. in the denominator) 



We show the degrees of the rational expressions over $GF(2^8)$ with respect to 
each variable in Table 2. Since for every subblock of every round $z_j^{(r)} \in GF(2^8)$, 
the rational expressions of SNAKE(1) and SNAKE(2) are of the same degree 
with respect to each variable, we put them together in one table, Table 2. 

Instance deduction. In the instance deduction, we fix some subblocks of the 
plaintexts $x = (x_8, x_7, \ldots, x_1)$ to a certain value. For example, we fix $\{x_7, \ldots, x_1\}$ 
to {0, ..., 0}, and represent a ciphertext subblock as a rational expression over 
$GF(2^8)$ in $x_8$. If the number of coefficients in this rational expression in $x_8$, 
denoted by N, does not exceed $2^8 = 256$, we can construct the rational expression 
using N pairs of chosen plaintexts and ciphertexts. Therefore, it is desirable to 
find chosen p/c pairs such that the required number of pairs is as small as 
possible. 

Chosen plaintexts useful for attacking SNAKE(1). We decided to find the 
rational expression with the fewest coefficients from the rational expressions in 
one variable. The reason for this is as follows. Let $\alpha$ and $\beta$ be the numbers of 
variables. For the rational expression of a subblock $z_j^{(r)}$, if $\alpha > \beta$, the minimum 
value of the number of coefficients in the rational expression in $\alpha$ variables is 
larger than that in $\beta$ variables. 

It is easy to compute the rational expressions in one variable for all possible 
chosen plaintexts, since there are only $2^8$ combinations. Our experimental results 
show that when we choose plaintexts s.t. $(x_8, 0, \ldots, 0)$ the number of coefficients 
in the rational expression in $x_8$ is the smallest for SNAKE(1). Table 3 shows 
the degrees of the rational expressions over $GF(2^8)$ when we choose plaintexts 
s.t. $(x_8, 0, \ldots, 0)$ and $(0, x_7, 0, \ldots, 0)$, respectively. The figures in brackets are 
the numbers of coefficients in the numerator or denominator polynomials. From 
Table 3 we can see that these rational expressions are dense, i.e., all coefficients 
are nonzero. Note that the degrees with respect to $x_8$ and $x_7$ in Table 3 are not 
always equal to those in Table 2, though one might conjecture that they are. 

Chosen plaintexts useful for attacking SNAKE(2). For SNAKE(2), if we choose 
plaintexts s.t. $(x_8, x_8, 0, \ldots, 0)$, the degree of the rational expression in $x_8$ falls as 
Table 4 shows. This is because the input to the leftmost S-box in the 2-nd round 
function becomes constant. Our experimental results show that the plaintexts 
Table 2. Degrees of rational exp. over $GF(2^8)$ of SNAKE(1) and SNAKE(2) (global 
deduction) 

subblock      numerator degrees                        denominator degrees 
$z_4^{(1)}$   1, 0, 0, 0, 0, 1, 1, 1                   0, 0, 0, 0, 0, 1, 1, 1 
$z_3^{(1)}$   0, 1, 0, 0, 0, 0, 1, 1                   0, 0, 0, 0, 0, 0, 1, 1 
$z_2^{(1)}$   0, 0, 1, 0, 0, 0, 0, 1                   0, 0, 0, 0, 0, 0, 0, 1 
$z_1^{(1)}$   0, 0, 0, 1, 1, 1, 1, 1                   0, 0, 0, 0, 1, 1, 1, 1 
$z_4^{(2)}$   0, 1, 1, 1, 2, 1, 2, 3                   0, 1, 1, 1, 1, 1, 2, 3 
$z_3^{(2)}$   0, 0, 1, 1, 1, 2, 1, 2                   0, 0, 1, 1, 1, 1, 1, 2 
$z_2^{(2)}$   0, 0, 0, 1, 1, 1, 2, 1                   0, 0, 0, 1, 1, 1, 1, 1 
$z_1^{(2)}$   1, 1, 1, 1, 1, 2, 3, 5                   1, 1, 1, 1, 1, 2, 3, 4 
$z_4^{(3)}$   2, 1, 2, 3, 3, 6, 7, 9                   1, 1, 2, 3, 3, 6, 7, 9 
$z_3^{(3)}$   1, 2, 1, 2, 2, 3, 6, 7                   1, 1, 1, 2, 2, 3, 6, 7 
$z_2^{(3)}$   1, 1, 2, 1, 1, 2, 3, 6                   1, 1, 1, 1, 1, 2, 3, 6 
$z_1^{(3)}$   1, 2, 3, 5, 6, 7, 9, 12                  1, 2, 3, 4, 6, 7, 9, 12 
$z_4^{(4)}$   3, 6, 7, 9, 11, 13, 20, 28               3, 6, 7, 9, 10, 13, 20, 28 
$z_3^{(4)}$   2, 3, 6, 7, 8, 11, 13, 20                2, 3, 6, 7, 8, 10, 13, 20 
$z_2^{(4)}$   1, 2, 3, 6, 7, 8, 11, 13                 1, 2, 3, 6, 7, 8, 10, 13 
$z_1^{(4)}$   6, 7, 9, 12, 13, 20, 28, 39              6, 7, 9, 12, 13, 20, 28, 38 
$z_4^{(5)}$   11, 13, 20, 28, 31, 45, 59, 81           10, 13, 20, 28, 31, 45, 59, 81 
$z_3^{(5)}$   8, 11, 13, 20, 22, 31, 45, 59            9, 10, 13, 20, 22, 31, 45, 59 
$z_2^{(5)}$   7, 8, 11, 13, 14, 22, 31, 45             7, 8, 10, 13, 14, 22, 31, 45 
$z_1^{(5)}$   13, 20, 28, 39, 45, 59, 81, 112          13, 20, 28, 38, 45, 59, 81, 112 
$z_4^{(6)}$   31, 45, 59, 81, 92, 125, 177, 244        31, 45, 59, 81, 91, 125, 177, 244 
$z_3^{(6)}$   22, 31, 45, 59, 67, 92, 125, 177         22, 31, 45, 59, 67, 91, 125, 177 
$z_2^{(6)}$   14, 22, 31, 45, 52, 67, 92, 125          14, 22, 31, 45, 52, 67, 91, 125 
$z_1^{(6)}$   45, 59, 81, 112, 125, 177, 244, 255      45, 59, 81, 112, 125, 177, 244, 255 
$z_4^{(7)}$   92, 125, 177, 244, 255, 255, 255, 255    91, 125, 177, 244, 255, 255, 255, 255 
$z_3^{(7)}$   67, 92, 125, 177, 199, 255, 255, 255     67, 91, 125, 177, 199, 255, 255, 255 
$z_2^{(7)}$   52, 67, 92, 125, 139, 199, 255, 255      52, 67, 91, 125, 139, 199, 255, 255 
$z_1^{(7)}$   125, 177, 244, 255, 255, 255, 255, 255   125, 177, 244, 255, 255, 255, 255, 255 

Note) Let $\deg_{x_i} f$ be the degree of f with respect to $x_i$. For each subblock, the 
degrees of the numerator and of the denominator are listed with respect to 
$x_8, x_7, \ldots, x_1$, in this order. 

s.t. $(x_8, x_8, 0, \ldots, 0)$ bring about the rational expression in one variable with the 
fewest coefficients for SNAKE(2). The plaintexts s.t. $(0, x_7, x_7, 0, \ldots, 0)$ bring 
about the second fewest one. 



4.2 Key Recovery 



In this subsection, we demonstrate how to recover the last round key by taking a simple example of a chosen plaintext attack of SNAKE(2) with 9 rounds, i.e. SNAKE(2)(8, 4, 64, 9) (see also Figure 3 in the Appendix). If we choose plaintexts s.t. $x = (x_8, x_8, 0, \ldots, 0)$, the second subblock from the right of the output of the 7-th round, $z_2^{(7)} \in GF(2^8)$, is represented as the rational expression $f_1(x_8)/f_2(x_8)$, where both $f_1(x_8)$ and $f_2(x_8)$ have 16 coefficients (see Table 4). Since a rational expression is determined only up to a common scalar factor, one coefficient can be normalized. Therefore, if 16 + 15 = 31 pairs of plaintexts s.t. $x = (x_8, x_8, 0, \ldots, 0)$ and corresponding ciphertexts are given, we can construct the rational expression. The attack equation is as follows.



$$\frac{f_1(x_8)}{f_2(x_8)} = y_6 + S\left(y_1 + k^{(9)}\right) \qquad (4)$$

Table 3. Degrees of rational exp. over GF(2^8) of SNAKE(1).

Each entry is given as numerator / denominator, with the number of coefficients in parentheses.

when x = (x_8, 0, ..., 0)

  r    z_4^(r)                  z_3^(r)               z_2^(r)               z_1^(r)
  1    1 (2) / 0 (1)            0 (1) / 0 (1)         0 (1) / 0 (1)         0 (1) / 0 (1)
  2    0 (1) / 0 (1)            0 (1) / 0 (1)         0 (1) / 0 (1)         0 (1) / 1 (2)
  3    2 (3) / 1 (2)            1 (2) / 1 (2)         1 (2) / 1 (2)         1 (2) / 1 (2)
  4    3 (4) / 3 (4)            2 (3) / 2 (3)         1 (2) / 1 (2)         5 (6) / 6 (7)
  5    11 (12) / 10 (11)        8 (9) / 8 (9)         7 (8) / 7 (8)         13 (14) / 13 (14)
  6    31 (32) / 31 (32)        22 (23) / 22 (23)     14 (15) / 14 (15)     44 (45) / 45 (46)
  7    92 (93) / 91 (92)        67 (68) / 67 (68)     52 (53) / 52 (53)     125 (126) / 125 (126)

Note) degrees with respect to x_8

when x = (0, x_7, 0, ..., 0)

  r    z_4^(r)                  z_3^(r)               z_2^(r)               z_1^(r)
  1    0 (1) / 0 (1)            1 (2) / 0 (1)         0 (1) / 0 (1)         0 (1) / 0 (1)
  2    0 (1) / 1 (2)            0 (1) / 0 (1)         0 (1) / 0 (1)         0 (1) / 1 (2)
  3    1 (2) / 1 (2)            2 (3) / 1 (2)         1 (2) / 1 (2)         2 (3) / 2 (3)
  4    5 (6) / 6 (7)            3 (4) / 3 (4)         2 (3) / 2 (3)         6 (7) / 7 (8)
  5    13 (14) / 13 (14)        11 (12) / 10 (11)     8 (9) / 8 (9)         20 (21) / 20 (21)
  6    44 (45) / 45 (46)        31 (32) / 31 (32)     22 (23) / 22 (23)     58 (59) / 59 (60)
  7    125 (126) / 125 (126)    92 (93) / 91 (92)     67 (68) / 67 (68)     177 (178) / 177 (178)

Note) degrees with respect to x_7



The last round key is recovered as follows. For 31 plaintext/ciphertext pairs, we compute the right side of Eq. (4) guessing $k^{(9)}$. Thus all the coefficients in $f_1(x_8)/f_2(x_8)$ are determined. If the constructed Eq. (4) holds for the 32-nd pair of plaintext and ciphertext, we can judge that the guessed $k^{(9)}$ is correct with high probability. Since $k^{(9)}$ is 8 bits, the average of the required complexity of this attack is about $32 \times 2^8 \times \frac{1}{2} \approx 2^{12}$ times the computation of an S-box. Note that the required complexity for constructing the rational expression is negligible.
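
The guess-interpolate-verify loop described above can be sketched on a toy instance. The code below is not SNAKE and is not the authors' implementation: it assumes a prime field GF(257) in place of GF(2^8), an inversion-style S-box, and two hypothetical coprime polynomials with 4 coefficients each. All function names are illustrative. For each key guess it solves the linear system given by $f_1(x) - t \cdot f_2(x) = 0$, where $t$ is the right side of the attack equation, and keeps the guess only if the resulting expression also fits a few extra pairs.

```python
import random

P = 257  # toy prime field standing in for GF(2^8) (assumption, not SNAKE's field)


def sbox(u):
    """Inversion-style toy S-box: u -> u^(-1) mod P, with 0 -> 0."""
    return pow(u, P - 2, P)


def poly_from_roots(roots):
    """Monic polynomial with the given roots; coeffs[i] multiplies x**i."""
    coeffs = [1]
    for r in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - r * c) % P
            nxt[i + 1] = (nxt[i + 1] + c) % P
        coeffs = nxt
    return coeffs


def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def solve_mod_p(rows, rhs):
    """Gauss-Jordan elimination mod P; returns None for a singular system."""
    n = len(rows)
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def make_pairs(f1, f2, key, count, rng):
    """Simulate data satisfying f1(x)/f2(x) = y6 + S(y1 + key) in GF(P)."""
    pairs = []
    for x in rng.sample(range(P), P):
        den = poly_eval(f2, x)
        if den == 0:
            continue  # the rational expression is undefined at this x
        z = poly_eval(f1, x) * pow(den, P - 2, P) % P
        y1 = rng.randrange(P)
        y6 = (z - sbox((y1 + key) % P)) % P
        pairs.append((x, y6, y1))
        if len(pairs) == count:
            break
    return pairs


def recover_key(pairs, n1, n2, extra):
    """Return every key guess whose interpolated f1/f2 also fits the extra pairs."""
    unknowns = n1 + n2 - 1  # f2 is normalised to be monic
    fit, check = pairs[:unknowns], pairs[unknowns:unknowns + extra]
    survivors = []
    for guess in range(P):
        def target(y6, y1):
            return (y6 + sbox((y1 + guess) % P)) % P
        rows, rhs = [], []
        for x, y6, y1 in fit:
            t = target(y6, y1)
            row = [pow(x, i, P) for i in range(n1)]
            row += [-t * pow(x, j, P) % P for j in range(n2 - 1)]
            rows.append(row)
            rhs.append(t * pow(x, n2 - 1, P) % P)
        sol = solve_mod_p(rows, rhs)
        if sol is None:
            continue
        f1, f2 = sol[:n1], sol[n1:] + [1]
        if all((poly_eval(f1, x) - target(y6, y1) * poly_eval(f2, x)) % P == 0
               for x, y6, y1 in check):
            survivors.append(guess)
    return survivors


if __name__ == "__main__":
    rng = random.Random(2024)
    f1 = poly_from_roots([7, 11, 13])  # hypothetical numerator, 4 coefficients
    f2 = poly_from_roots([2, 3, 5])    # hypothetical denominator, 4 coefficients
    data = make_pairs(f1, f2, key=123, count=10, rng=rng)
    print(recover_key(data, n1=4, n2=4, extra=3))
```

With 4 + 3 = 7 pairs used for interpolation, the correct key always survives because the system for the true rational expression has a unique solution. A wrong guess survives only if its interpolated expression happens to fit all extra pairs, which is why the text checks a further pair.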

We apply similar key-recovery attacks to SNAKE with several rounds. We show some of the best attacks in Table 5; by "best" we mean that there are trade-offs between the required number of p/c pairs and the complexity. The complexity is measured by the number of computations of the round function. In the column of strategy we show the attack strategies using simple symbols.

For example, the strategy "7i + 1k" means that the key-recovery attack uses the instance deduction of 7 rounds and recovers the last round key (m bits) as Figure 3 shows. The strategy "7i + 2k" means that the key-recovery attack uses the instance deduction of 7 rounds and recovers the last round key (4m bits) and a subblock of the key of the second round from the bottom (m bits). The strategy "11i + 2g + 1k" means that the key-recovery attack uses the meet-in-the-middle
Table 4. Degrees of rational expression over GF(2^8) of SNAKE(2)

Each entry is given as numerator / denominator, with the number of coefficients in parentheses.

when x = (x_8, x_8, 0, ..., 0)

  r    z_4^(r)                  z_3^(r)               z_2^(r)               z_1^(r)
  1    1 (2) / 0 (1)            1 (2) / 0 (1)         0 (1) / 0 (1)         0 (1) / 0 (1)
  2    0 (1) / 1 (2)            0 (1) / 0 (1)         0 (1) / 0 (1)         0 (1) / 0 (1)
  3    1 (2) / 0 (1)            1 (2) / 0 (1)         0 (1) / 0 (1)         1 (2) / 1 (2)
  4    2 (3) / 3 (4)            1 (2) / 1 (2)         1 (2) / 1 (2)         1 (2) / 1 (2)
  5    4 (5) / 3 (4)            3 (4) / 2 (3)         1 (2) / 1 (2)         7 (8) / 7 (8)
  6    13 (14) / 14 (15)        9 (10) / 9 (10)       8 (9) / 8 (9)         14 (15) / 14 (15)
  7    35 (36) / 34 (35)        25 (26) / 24 (25)     15 (16) / 15 (16)     52 (53) / 52 (53)
  8    105 (106) / 106 (107)    76 (77) / 76 (77)     60 (61) / 60 (61)     139 (140) / 139 (140)

Note) degrees with respect to x_8



Table 5. Interpolation attacks of SNAKE(i)(8, 4, 64, r)

SNAKE(1)
  #rounds (r)   #pairs   complexity     chosen plaintexts        strategy
  9             59       2^7            (x_8, 0, ..., 0)         6i + 1g + 1k
  9             106      2^14           (x_8, 0, ..., 0)         7i + 1k
  10            106      2^46           (x_8, 0, ..., 0)         7i + 2k
  10            211      2^39           (x_8, 0, ..., 0)         7i + 1g + 1k
  11            211      2^47           (x_8, 0, ..., 0)         7i + 1g + 2k

SNAKE(2)
  #rounds (r)   #pairs   complexity     chosen plaintexts        strategy
  9             32       [illegible]    (x_8, x_8, 0, ..., 0)    7i + 1k
  10            32       2^43           (x_8, x_8, 0, ..., 0)    7i + 2k
  10            63       2^38           (x_8, x_8, 0, ..., 0)    7i + 1g + 1k
  10            122      2^14           (x_8, x_8, 0, ..., 0)    8i + 1k
  11            122      2^46           (x_8, x_8, 0, ..., 0)    8i + 2k



approach where the instance deduction of 11 rounds and the global deduction of 2 rounds are used, and recovers the last round key (4m bits) as Figure 4 shows.

5 Interpolation Attack of SNAKE(i)(16, 4, 128, r)

We apply the interpolation attack to a 128-bit variant, SNAKE(i)(16, 4, 128, r). When the block size is 128 bits and the size of input and output of the S-box is 16 bits, the maximum number of available p/c pairs for the attacker increases compared with the case when the block size is 64 bits. Thus, some attacks become possible that would be impractical when the block size was 64 bits.

For example, we demonstrate the interpolation attack of SNAKE(2) with 15 rounds, i.e. SNAKE(2)(16, 4, 128, 15) (see also Figure 4 in the Appendix). If we choose plaintexts s.t. $x = (x_8, x_8, 0, \ldots, 0)$, the second subblock from the
Table 6. Degrees of rational exp. over GF(2^16) of SNAKE(1)

Each entry is given as numerator / denominator.

when x = (x_8, 0, ..., 0)

  r    z_4^(r)            z_3^(r)            z_2^(r)            z_1^(r)
  8    275 / 275          199 / 199          139 / 139          380 / 381
  9    811 / 810          587 / 587          433 / 433          1119 / 1119
  10   2414 / 2414        1751 / 1751        1258 / 1258        3330 / 3331
  11   7151 / 7150        5176 / 5176        3764 / 3764        9873 / 9873
  12   21227 / 21227      15388 / 15388      11131 / 11131      29294 / 29294
  13   -                  -                  33059 / 33059      -

Note) degrees with respect to x_8



Table 7. Degrees of rational exp. over GF(2^16) of SNAKE(2)

Each entry is given as numerator / denominator.

when x = (x_8, x_8, 0, ..., 0)

  r    z_4^(r)            z_3^(r)            z_2^(r)            z_1^(r)
  9    310 / 309          224 / 223          154 / 154          433 / 433
  10   916 / 917          663 / 663          493 / 493          1258 / 1258
  11   2724 / 2723        1975 / 1974        1412 / 1412        3764 / 3764
  12   8067 / 8068        5839 / 5839        4257 / 4257        11131 / 11131
  13   23951 / 23950      17363 / 17362      12543 / 12543      33059 / 33059

Note) degrees with respect to x_8



right of the output of the 11-th round, $z_2^{(11)} \in GF(2^{16})$, is represented as the rational expression $f_1(x_8)/f_2(x_8)$. According to Table 7, $\deg_{x_8} f_1 = 1412$ and $\deg_{x_8} f_2 = 1412$. On the other hand, $z_2^{(11)}$ is also represented using $\tilde{y}$, which can be computed from $y$ and a guessed $k^{(15)}$. Using the meet-in-the-middle approach, we have the attack equation as follows.

$$\frac{f_1(x_8)}{f_2(x_8)} = \frac{g_1\left(\tilde{y}(y, k^{(15)})\right)}{g_2\left(\tilde{y}(y, k^{(15)})\right)}$$

From Table [?], the numbers of coefficients in $g_1$ and $g_2$ are 14 and 10, respectively. The numbers of coefficients in $f_1$ and $f_2$ are estimated to be at most 1413. Therefore, the required number of p/c pairs for constructing the attack equation is at most $(14 - 1) \times (1413 - 1) + 10 \times 1413 \approx 2^{15}$ for a guessed $k^{(15)}$. Since $k^{(15)}$ is 64 bits, the average of the required complexity of this attack is at most $2^{15} \times 2^{64} \times \frac{1}{2}$, i.e. about $2^{78}$ times the computation of the round function.

If we didn't know the actual numbers of coefficients in $g_1$ and $g_2$, we would estimate them to be 48 and 32, respectively, from Eq. (?) and the degrees of $z_2$ in Table [?]. Then the attack would be considered impossible, because the required number of pairs exceeds $2^{16}$, the number of distinct values that $x_8 \in GF(2^{16})$ can take.
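
The two pair counts compared above can be checked with a few lines of arithmetic. This is only a restatement of the numbers in the text; the helper name `pairs_needed` is illustrative, and the counting formula is the one used in the preceding paragraph.

```python
import math


def pairs_needed(g1_coeffs, g2_coeffs, f_coeffs):
    """Pair count used in the text: (#g1 - 1) * (#f - 1) + #g2 * #f."""
    return (g1_coeffs - 1) * (f_coeffs - 1) + g2_coeffs * f_coeffs


actual = pairs_needed(14, 10, 1413)     # with the actual numbers of coefficients
estimated = pairs_needed(48, 32, 1413)  # with the estimated numbers of coefficients
available = 2 ** 16                     # distinct values of x_8 in GF(2^16)

print(actual, round(math.log2(actual), 2))  # 32486, just under 2^15
print(estimated, estimated > available)     # 111580 exceeds 2^16

# Average work: 2^15 pairs for each of 2^64 key guesses, halved on average.
complexity_log2 = 15 + 64 - 1
print(complexity_log2)                      # 78
```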

We applied similar key-recovery attacks to SNAKE with the block size of 128 bits with several rounds. Similarly to Section 4, we show some of the best attacks in Table 9. As a practical attack of SNAKE(2)(16, 4, 128, 16), we give the attack
Table 8. Degrees of rational expression over GF(2^16) of SNAKE(2)

Each entry is given as numerator / denominator, where each degree is a pair (degree in x_8, degree in x_7).

when x = (x_8, x_8 + x_7, x_7, 0, ..., 0)

  r    z_4^(r)                       z_3^(r)                       z_2^(r)                       z_1^(r)
  1    1,0 / 0,0                     1,1 / 0,0                     0,1 / 0,0                     0,0 / 0,0
  2    0,0 / 1,0                     0,0 / 0,1                     0,0 / 0,0                     0,0 / 0,0
  3    1,1 / 0,1                     1,1 / 0,0                     0,1 / 0,0                     1,1 / 1,1
  4    2,1 / 3,1                     1,2 / 1,3                     1,1 / 1,1                     1,2 / 1,2
  5    4,7 / 3,7                     3,4 / 2,3                     1,3 / 1,2                     7,8 / 7,8
  6    13,14 / 14,14                 9,13 / 9,14                   8,9 / 8,9                     14,22 / 14,22
  7    35,52 / 34,52                 25,35 / 24,34                 15,25 / 15,24                 52,67 / 52,67
  8    105,139 / 106,139             76,105 / 76,106               60,76 / 60,76                 139,199 / 139,199
  9    310,433 / 309,433             224,310 / 223,309             154,224 / 154,223             433,587 / 433,587
  10   916,1258 / 917,1258           663,916 / 663,917             493,663 / 493,663             1258,1751 / 1258,1751
  11   2724,3764 / 2723,3764         1975,2724 / 1974,2723         1412,1975 / 1412,1974         3764,5176 / 3764,5176
  12   8067,11131 / 8068,11131       5839,8067 / 5839,8068         4257,5839 / 4257,5839         11131,15388 / 11131,15388
  13   23951,33059 / 23950,33059     17363,23951 / 17362,23950     12543,17363 / 12543,17362     33059,45602 / 33059,45602
  14   -                             -                             37316,51441 / 37316,51441     -

Note) degrees with respect to x_8 and x_7



which requires $< 2^{32}$ p/c pairs and complexity of $2^{47}$ times the computation of the round function. It uses the chosen plaintexts s.t. $x = (x_8, x_8 + x_7, x_7, 0, \ldots, 0)$. This brings about the rational expression in two variables with the fewest coefficients for SNAKE(2).

6 Discussion and Concluding Remarks 

Division by 0. Here we discuss a problem in the interpolation attack using rational expressions, which wasn't pointed out in [3]. The problem is that we can't always construct correct rational expressions if at least one of the inputs of the S-boxes is 0 in the encryption process. We can detect this by comparing the degree of the constructed rational expression with that in Table [?] etc. The degree often gets higher. If this happens, we may construct the rational expression again using different pairs of plaintexts and ciphertexts. In this case the required number of p/c pairs can be estimated by the cryptanalysis with probabilistic non-linear relations shown by Jakobsen [2]. This is based on Sudan's algorithm for decoding Reed-Solomon codes beyond the error-correction diameter. If we apply Jakobsen's result to this problem, when the rational expression holds with probability $\mu$, the attack is possible with $N' = N/\mu^2$ p/c pairs in time polynomial in $N'$, where $N$ is the number of coefficients in the rational expression. The probability $\mu$ is equal to the probability with which none of the inputs of the S-boxes are zero in the encryption process. Therefore, $\mu = \left(\frac{2^m - 1}{2^m}\right)^{4r}$ for SNAKE(i)(m, 4, 8m, r), if we assume every input of the S-box is random and independent. For example,






S. Moriai, T. Shimoyama, and T. Kaneko 



Table 9. Interpolation attacks of SNAKE(i)(16, 4, 128, r)

SNAKE(1)
  #rounds (r)   #pairs        complexity   chosen plaintexts                   strategy
  13            [illegible]   2^2[?]       (x_8, 0, ..., 0)                    11i + 1k
  14            2^15          2^30         (x_8, 0, ..., 0)                    12i + 1k
  15            2^15          2^94         (x_8, 0, ..., 0)                    12i + 2k
  15            2^16          2^79         (x_8, 0, ..., 0)                    12i + 1g + 1k

SNAKE(2)
  #rounds (r)   #pairs        complexity   chosen plaintexts                   strategy
  13            2^[?]         2^[?]4       (x_8, x_8, 0, ..., 0)               10i + 1g + 1k
  13            2^12          2^27         (x_8, x_8, 0, ..., 0)               11i + 1k
  14            2^14          2^29         (x_8, x_8, 0, ..., 0)               12i + 1k
  15            2^14          2^93         (x_8, x_8, 0, ..., 0)               12i + 2k
  15            2^15          2^30         (x_8, x_8, 0, ..., 0)               13i + 1k
  16            2^15          2^94         (x_8, x_8, 0, ..., 0)               13i + 2k
  16            2^32          2^47         (x_8, x_8 + x_7, x_7, 0, ..., 0)    14i + 1k




the attack for SNAKE(i)(8, 4, 64, 11) is possible with $N' = N/\mu^2 \approx 1.411N$ p/c pairs in time polynomial in $N'$, and the attack for SNAKE(2)(16, 4, 128, 16) is possible with $N' = N/\mu^2 \approx 1.002N$ p/c pairs in time polynomial in $N'$.
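
The two overhead factors quoted above follow directly from the expression for $\mu$. The short sketch below recomputes them under the same independence assumption; the function name `pair_overhead` is illustrative.

```python
def pair_overhead(m, r, sboxes_per_round=4):
    """Return (mu, 1/mu^2) for SNAKE(i)(m, 4, 8m, r).

    mu is the probability that none of the 4r S-box inputs is zero,
    assuming every S-box input is uniform and independent.
    """
    mu = ((2 ** m - 1) / 2 ** m) ** (sboxes_per_round * r)
    return mu, 1 / mu ** 2


for m, r in [(8, 11), (16, 16)]:
    mu, factor = pair_overhead(m, r)
    print(f"m={m}, r={r}: mu={mu:.4f}, N'/N={factor:.3f}")
```

The overhead is noticeable for the 8-bit S-box (about 41% more pairs) and almost vanishes for the 16-bit S-box, since a zero input becomes far less likely.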

Diffusion layer. It is the plain diffusion layer that makes the interpolation attack 
on the cipher SNAKE(2) easier. The diffusion layer of the cipher SNAKE(2) 
keeps the outputs of some subblocks constant for some chosen plaintexts. We 
have to consider this in designing the diffusion layer. 

Concluding remarks. We presented an efficient interpolation attack using a computer algebra system. We succeeded in attacking the block cipher SNAKE efficiently, with smaller complexity and fewer p/c pairs, 1) by finding the actual number of coefficients in the rational expression used in the attack and 2) by finding the rational expression with the fewest coefficients by choosing the plaintexts. We found some attacks to be feasible that would be considered impossible by the interpolation attack described in [3]. When we evaluate the resistance of a block cipher to the interpolation attack, it is necessary to apply the interpolation attack described in this paper.

We showed that when the block size is 64 bits and m = 8, all round keys are recovered for SNAKE(1) and SNAKE(2) with up to 11 rounds. Moreover, when the block size is 128 bits and m = 16, all round keys are recovered for SNAKE(1) with up to 15 rounds and SNAKE(2) with up to 16 rounds.

References 

1. E. Biham and A. Shamir, "Differential Cryptanalysis of DES-like Cryptosystems," Journal of Cryptology, Volume 4, Number 1, pp. 3-72, Springer-Verlag, 1991.



Interpolation Attacks of the Block Cipher: SNAKE 






2. T. Jakobsen, "Cryptanalysis of Block Ciphers with Probabilistic Non-Linear Relations of Low Degree," Advances in Cryptology - CRYPTO'98, Lecture Notes in Computer Science 1462, pp. 212-222, Springer-Verlag, 1998.

3. T. Jakobsen and L.R. Knudsen, "The Interpolation Attack on Block Ciphers," Fast Software Encryption, FSE'97, Lecture Notes in Computer Science 1267, pp. 28-40, Springer-Verlag, 1997.

4. L.R. Knudsen, "Block Ciphers - Analysis, Design and Applications," PhD thesis, Aarhus University, Denmark, 1994.

5. C. Lee and Y. Cha, "The Block Cipher: SNAKE with Provable Resistance against DC and LC attacks," In Proceedings of 1997 Korea-Japan Joint Workshop on Information Security and Cryptology (JW-ISC'97), pp. 3-17, 1997.

6. M. Matsui, "Linear Cryptanalysis Method for DES Cipher," Advances in Cryptology - EUROCRYPT'93, Lecture Notes in Computer Science 765, pp. 386-397, Springer-Verlag, 1994.

7. M. Noro and T. Takeshima, "Risa/Asir - a computer algebra system," Proceedings of ISSAC'92, pp. 387-396, ACM Press, 1992.

8. K. Nyberg and L.R. Knudsen, "Provable Security Against a Differential Attack," Journal of Cryptology, Volume 8, Number 1, pp. 27-37, Springer-Verlag, 1995.

9. V. Rijmen, J. Daemen, B. Preneel, A. Bosselaers, and E. De Win, "The cipher SHARK," Fast Software Encryption, FSE'96, Lecture Notes in Computer Science 1039, pp. 99-112, Springer-Verlag, 1996.



Appendix 



Fig. 3. A chosen plaintext attack of SNAKE(2)(8, 4, 64, 9)

Fig. 4. A chosen plaintext attack of SNAKE(2)(16, 4, 128, 15)




High-Speed Pseudorandom Number Generation 
with Small Memory 



William Aiello^1, S. Rajagopalan^2, and Ramarathnam Venkatesan^3

^1 AT&T Labs - Research, Florham Park, NJ, U.S.A. aiello@research.att.com †
^2 Telcordia Technologies, 445 South Street, Morristown, NJ 07960, U.S.A. sraj@research.telcordia.com
^3 Microsoft Research, One Microsoft Way, Redmond, WA, U.S.A. venkie@microsoft.com †



Abstract. We present constructions for a family of pseudorandom generators that are very fast in practice, yet possess provable strong cryptographic and statistical unpredictability properties. While such constructions were previously known, our constructions here have much smaller memory requirements, e.g., small enough for smart cards, etc. Our memory improvements are achieved by using variants of pseudorandom functions. The security requirements of this primitive are a weakening of the security requirements of a pseudorandom function. We instantiate this primitive by a keyed secure hash function. A sample construction based on DES and MD5 was found to run at about 20 megabits per second on a Pentium II.

1 Introduction 

We present a simple and practical construction for generating secure pseudorandom numbers in software, improving on the prior work [?] by reducing the memory requirement significantly. The algorithm presented in [?] is efficient and possesses provable strong cryptographic and statistical unpredictability properties. The approach was to start with a slow but cryptographically secure pseudorandom generator based on a strong block cipher and then stretch the outputs using the intractability of a certain coding problem and the rapidly mixing property of random walks on expander graphs. In a typical configuration, this generator implemented in C ran at speeds up to 70 Mbits/second on a 200 MHz Pentium Pro running Windows NT 4.0. This generator was also incorporated into an IPSec testbed and integrated with a high quality duplex video teleconferencing system on Unix and NT platforms. This typical configuration required 16 to 32 Kbytes of memory which, unfortunately, is quite significant for devices with a limited amount of memory such as smart cards or cell-phones.

Our improvement here is achieved by replacing the stretching mechanism in [?] with what we call random-input pseudorandom (family of) functions (ri-prf) or hidden-input pseudorandom (family of) functions (hi-prf). These functions accept random inputs and map them into longer pseudorandom outputs. The former allows the adversary to see the random inputs while the latter does not. Hence their security requirements are weaker than those of the usual pseudorandom functions (prf), which must map adversarially and adaptively chosen inputs into pseudorandom outputs. The existence of a ri-prf is equivalent to the existence of a prf (and both are equivalent to the existence of a one-way function by [?] and [?]). However, while every prf is also a ri-prf, not every ri-prf is a prf. For example, consider the following simple modification to a prf. On input 0, the modified function always outputs zero. Such a modified function can obviously be distinguished from a random function with one query chosen by the adversary. However, the modified function remains a ri-prf, because a randomly chosen input equals 0 only with negligible probability.

† Work done while at Telcordia.
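
The separation between the two notions can be made concrete with a small sketch. Here HMAC-SHA256 stands in for a prf purely for illustration (an assumption; the paper does not prescribe it), and the function names are hypothetical. An adversary that chooses its own query detects the modification with a single query at the all-zero input, whereas a uniformly random 64-bit input hits that point with probability 2^-64.

```python
import hashlib
import hmac

OUT_LEN = 32  # output length of the stand-in prf, in bytes


def prf(key: bytes, x: bytes) -> bytes:
    """Stand-in prf (assumption): HMAC-SHA256 keyed with `key`."""
    return hmac.new(key, x, hashlib.sha256).digest()


def modified_prf(key: bytes, x: bytes) -> bytes:
    """The modification from the text: input 0 always maps to zero."""
    if x == bytes(len(x)):
        return bytes(OUT_LEN)
    return prf(key, x)


def chosen_query_distinguisher(oracle, input_len: int = 8) -> bool:
    """One adaptive query at input 0; True means 'this is the modified function'."""
    return oracle(bytes(input_len)) == bytes(OUT_LEN)


if __name__ == "__main__":
    key = b"\x13" * 16
    print(chosen_query_distinguisher(lambda x: modified_prf(key, x)))  # flagged
    print(chosen_query_distinguisher(lambda x: prf(key, x)))           # not flagged
```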

We suggest that ri-prf's can be instantiated in practice by well known secure hash functions such as MD5 and SHA-1. The current attacks against these hash functions ([?], [?], [?], [?]) do not contradict their being secure ri-prf's.

We note that secure hash functions have been used to construct different primitives. For example, [?] construct prf's on arbitrary length inputs from prf's on fixed length inputs which they instantiated by well-known compression functions. [?] construct a message authentication code (an unpredictable prf family) from well-known compression functions. [?] have independently explored several variations of the prf model given by varying the requirement between predictability and indistinguishability, and varying the query model from adaptively chosen to random. While our ri-prf model is the same as one of their variations, our hi-prf model is new.

Our implementation in C using MD5 and DES ran at 28 to 40 Mbits/sec on a 233 MHz Pentium II running Windows NT 4.0. The memory requirement ranges from 128 Bytes to 3 KBytes. The algorithm can be adapted for encrypting packets that are sent across a network and may be received out of order.

2 The Generator Gvra 

In [?], a generator named VRA ("Video Rate Algorithm") was presented. We will denote this generator by Gvra and briefly recall its construction for purposes of comparison. The generator presented in this paper is denoted G and described in the next section. In addition, we will present some improvements to Gvra itself in this paper in Section [?].

Gvra has parameters (n, k, L). It starts with a short preprocessing step in which it runs a slow but secure generator, denoted by $G_0$, the output of which is subsequently stretched to a much longer output during the run of the algorithm in such a way that many desirable properties are preserved. The secure generator $G_0$ may be based on a one-way permutation (owp) or block cipher. The computational cost of the preprocessing step is amortized over a long output. The outputs $z_1, z_2, \ldots, z_m$ are each L bits long. The overall generator needs (n + 1)L bits of memory and it takes roughly k simple machine operations like shift, exclusive-or and table look-up to generate each machine word-sized block. The relevant security parameter is $\binom{n}{k}$, where n and k can be chosen so that $\binom{n}{k}$ is large enough to be considered infeasible (e.g. ...) and $1/\binom{n}{k}$ negligible. Note here that n is not the length of the seed input to the strong generator. Also, n and k can be varied within certain bounds to provide a smooth time-space tradeoff.



2.1 Construction of Gvra 

The approach in [?] is to begin with a few bits from a high quality source with cryptographic properties. The algorithm "stretches" these bits to a much longer sequence while maintaining many desirable properties.

The generator Gvra has a simple description. First, let us assume we are given a possibly slow but perfect or cryptographically secure generator $G_0$ (it was shown in [?] how to construct one from a strong cipher such as DES or Triple-DES). Let $x \leftarrow G_0(\cdot)$ denote that x is assigned a random string of length |x| by making a call to $G_0$ and obtaining the next block of random bits of appropriate length. Below, we use a d-regular expander graph with $2^L$ nodes, each labeled with an L-bit string. Since the graph is d-regular, it can be edge-colored with colors $\{1, \ldots, d\}$. For any node y and $1 \le r \le d$, let neighbor(y, r) denote the neighbor of y reached by traversing the edge colored r. Gvra has two processes $P_0$ and $P_1$, each producing a sequence of L-bit numbers. Here $\oplus$ stands for bitwise exclusive-or of strings of equal length. A[i] denotes the i-th row of matrix A. p is a probability parameter, $p \in [0, \frac{1}{2}]$.

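Initialization for both processes

    P_0 : $A \leftarrow G_0(\cdot)$, $A \in \{0,1\}^{n \times L}$.

    P_1 : $y_0 \leftarrow G_0(\cdot)$, $y_0 \in \{0,1\}^{L}$.

On-line steps for i := 1, 2, ... do

    P_0 : $c_1, \ldots, c_k \leftarrow G_0(\cdot)$, $c_j \in \{0, 1, \ldots, n-1\}$, $c_j$ distinct
          $x_i := \bigoplus_{j=1}^{k} A[c_j]$, $x_i \in \{0,1\}^{L}$

    P_1 : $\delta \leftarrow G_0$, $\delta \in \{0,1\}^{\lceil \log 1/p \rceil}$.
          if ($\delta = 00 \ldots 0$) $y_i := y_{i-1}$ [i.e., stay at the same node with probability p]
          else $r \leftarrow G_0$, $r \in \{1, \ldots, d\}$, $y_i := neighbor(y_{i-1}, r)$ [i.e., move to a random neighbor]

    Output $z_i := x_i \oplus y_i$.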

Note that for process $P_1$, the random walk on an expander, the computation of each neighbor of a node must be both efficiently computable and require small memory. The Gabber-Galil [?] expander, defined below, is the most efficient in computation and memory of all known expanders. The neighbors of (x, y) are $(x, x + 2y + \{0, 1, 2\})$ and $(2x + y + \{0, 1, 2\}, y)$, where x and y are $\ell = L/2$ bit strings and arithmetic is done mod $2^{\ell}$.
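
The two processes and the Gabber-Galil neighbor rule can be sketched as follows. This is an illustrative toy, not the authors' implementation: Python's `random.Random` stands in for the strong generator $G_0$ (it is not cryptographically secure), L-bit strings are held as integers, the class and parameter names are hypothetical, and `p_bits` plays the role of $\lceil \log 1/p \rceil$.

```python
import random


class ToyVRA:
    """Toy model of Gvra with the Gabber-Galil walk (d = 6)."""

    def __init__(self, n, k, L, p_bits, seed):
        assert L % 2 == 0 and p_bits >= 1 and k <= n
        self.g0 = random.Random(seed)  # stand-in for G_0 (NOT secure)
        self.n, self.k, self.L, self.p_bits = n, k, L, p_bits
        self.half = L // 2
        # P_0 initialisation: n rows of L random bits each.
        self.A = [self.g0.getrandbits(L) for _ in range(n)]
        # P_1 initialisation: random start node y_0.
        self.y = self.g0.getrandbits(L)

    def neighbor(self, node, r):
        """r in 0..5 selects one of the six Gabber-Galil neighbors of (x, y)."""
        mask = (1 << self.half) - 1
        x, y = node >> self.half, node & mask
        if r < 3:
            y = (x + 2 * y + r) & mask        # (x, x + 2y + {0,1,2})
        else:
            x = (2 * x + y + (r - 3)) & mask  # (2x + y + {0,1,2}, y)
        return (x << self.half) | y

    def next_block(self):
        # P_0: XOR of k distinct random rows of A.
        x_i = 0
        for c in self.g0.sample(range(self.n), self.k):
            x_i ^= self.A[c]
        # P_1: stay put when delta is all zero, otherwise take a random edge.
        delta = self.g0.getrandbits(self.p_bits)
        if delta != 0:
            self.y = self.neighbor(self.y, self.g0.randrange(6))
        return x_i ^ self.y  # z_i := x_i XOR y_i


if __name__ == "__main__":
    gen = ToyVRA(n=256, k=6, L=512, p_bits=1, seed=1)
    print(hex(gen.next_block())[:18], "...")
```

Storing the matrix A dominates the memory footprint (n rows of L bits), which is the cost the rest of the paper sets out to reduce.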

The generator thus stretches $k \lceil \log n \rceil + \lceil \log d \rceil + \lceil \log 1/p \rceil$ bits (all logs are binary) to L bits. For example, for the setting (n = 256, k = 6, d = 6, L = 512, p = 1/2) the stretch factor is approximately 10. The properties of the generator are detailed in [?].
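
As a quick check of the stretch factor, the count of input bits per on-line step can be evaluated directly. The helper name is illustrative, and the value p = 1/2 is an assumption about the example setting.

```python
import math


def vra_stretch(n, k, d, L, p):
    """Bits drawn from G_0 per on-line step of Gvra, and the resulting stretch."""
    bits_in = (k * math.ceil(math.log2(n))
               + math.ceil(math.log2(d))
               + math.ceil(math.log2(1 / p)))
    return bits_in, L / bits_in


bits_in, factor = vra_stretch(n=256, k=6, d=6, L=512, p=0.5)
print(bits_in, round(factor, 2))  # 52 input bits, stretch just under 10
```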






2.2 The Large Memory Requirement of Gvra 

Note that Gvra requires a matrix A of n x L random bits. A typical setting for achieving high speed and reasonable security is n = 256 and L = 1024 [?], i.e., an A of size 32 Kbytes, which is significant and may result in two difficulties. First, the memory requirement may be unacceptable to small devices such as cell phones, handheld computing devices, and even low-end set-top boxes. On platforms such as smart cards, a realistic size for available memory is about 1 Kbyte. Second, the cache architecture on modern CISC CPUs such as i86xx is such that data structures that do not fit in a cache page suffer from high page swap penalties. Gvra suffers from this problem as the random row accesses to the matrix A cannot be effectively pipelined. The survey in [?] recommends that tables should not be larger than 4 Kbytes. Furthermore, for applications such as e-commerce transactions which require both high speed and semantic security (i.e. every bit must be unpredictable to the adversary even if all the other bits are known), the proof in [?] shows that the memory requirement of Gvra is much larger. There seems to be no easy way to configure Gvra with small memory and achieve high security.

In the following section, we show how to reduce the memory requirement to 
about 1Kbyte while still achieving high security and reasonable speed. We do so 
by introducing a new primitive called a random-input pseudorandom function. 

3 Construction and Properties of the Generator G 

As in the case of Gvra, our approach is to begin with a few pseudorandom bits from a high quality source with strong cryptographic properties and "stretch" these bits to a much longer sequence while maintaining security properties. For this we will use an additional assumption. To begin with, we consider using a pseudorandom function (prf) [?]. However, as we will see, we do not need the full security of a prf. We will be able to use a length-increasing function with a weaker security requirement which we call a random-input pseudorandom function. Whereas in the case of a prf its outputs must appear to be random to a computationally bounded adversary even when the adversary makes adaptive queries to the function, for a length-increasing random-input pseudorandom function (ri-prf) we only require that the outputs appear to be random to a computationally bounded adversary when the inputs are chosen at random. The expectation is that weaker security requirements for a random-input pseudorandom function may be traded off for greater efficiency.

Next we will describe our construction of a pseudorandom generator G assuming the availability of a ri-prf. In Section 4.2 we formally define ri-prf's and suggest implementations.

3.1 The Construction of G 

Our generator is a simple modification of Gvra. G takes parameters $\ell$, $m_1$, $m_2$, and n. As before, we assume we have a cryptographically strong prg $G_0$. We also assume that we are given a random-input pseudorandom function family with parameters $\ell$, $m_1$ and $m_2$, $f : \{0,1\}^{\ell} \times \{0,1\}^{m_1} \to \{0,1\}^{m_2}$, $m_2 > m_1$. We write $f_K(x) = f(K, x)$. The parameter n is typically less than 10. Let $L = n m_2$. The $P_1$ process for G is exactly the same as that for Gvra.

A simple description of $P_0$ is as follows. In the pre-processing step, randomly choose n keys of length $\ell$ for the n random-input pseudorandom functions. On each on-line step, choose a random input of length $m_1$ and feed it to each of the ri-prf's. $x_i$ is the concatenation of all the outputs of the n functions.

We describe the new $P_0$ process formally below.

Initialization

    P_0 : $K_1, K_2, \ldots, K_n \leftarrow G_0(\cdot)$, $|K_i| = \ell$.

On-line steps for i := 1, 2, ... do

    P_0 : $B \leftarrow G_0(\cdot)$, $B \in \{0,1\}^{m_1}$
          $x_i := f_{K_1}(B) \circ \cdots \circ f_{K_n}(B)$

As in Gvra, the i-th output of G is $z_i := x_i \oplus y_i$ where $y_i$ is the i-th output of $P_1$.

Clearly, the memory requirement of $P_0$ is $\le n \times \ell + m_1$ bits (not counting the internal memory requirements of f, which is used as a black box). In a prototypical implementation using MD5, $m_2 = 128$, $m_1 = 64$, $\ell = 512 - 64$. Typically n = 4 and hence the memory needed is about 0.5 Kbyte. In the on-line steps, the generator thus stretches $m_1 + \lceil \log 1/p \rceil + \lceil \log d \rceil$ bits to L bits where $L = n \times m_2$. For example, for the setting (n = 4, d = 6, $\ell$ = 448, $m_1$ = 64, $m_2$ = 128, p = 1/2) the stretch factor is approximately 8.
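
A minimal sketch of the new $P_0$ process with the prototypical parameters might look as follows. It makes several assumptions that are not taken from the paper: the ri-prf is modelled as MD5 applied to the key concatenated with the input via `hashlib` (which adds its own padding, so this is not a single raw compression-function call), `random.Random` again stands in for $G_0$, and the class name is hypothetical.

```python
import hashlib
import random


class ToyP0:
    """Toy P_0 of G: n keyed MD5 calls on one random m1-bit input per step."""

    def __init__(self, n=4, key_bytes=56, input_bytes=8, seed=0):
        self.g0 = random.Random(seed)  # stand-in for G_0 (NOT secure)
        self.input_bytes = input_bytes  # m1 = 64 bits
        # Pre-processing: n keys of length l = 448 bits (56 bytes).
        self.keys = [self._draw(key_bytes) for _ in range(n)]

    def _draw(self, nbytes):
        return self.g0.getrandbits(8 * nbytes).to_bytes(nbytes, "big")

    def next_x(self):
        """x_i := f_{K_1}(B) o ... o f_{K_n}(B), with f_K(B) modelled as MD5(K || B)."""
        b = self._draw(self.input_bytes)
        return b"".join(hashlib.md5(k + b).digest() for k in self.keys)


if __name__ == "__main__":
    p0 = ToyP0()
    x = p0.next_x()
    print(len(x) * 8)  # L = n * m2 = 512 bits
```

Only the n keys and the current input need to be stored, in contrast to the n x L matrix of Gvra. The full generator would XOR each block with the output of the unchanged random-walk process $P_1$.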

Outline: In Section 4 we give definitions and conventions used in this paper and show that the process $P_0$ has strong security properties. In Section [?] we describe the properties of the full generator. In Section [?] we describe some variations in the construction of G and Gvra and their usefulness in specific scenarios.



4 Definitions and Notations 

4.1 Preliminaries 

For any string $x \in \{0,1\}^*$, let |x| be its length and let $\circ$ denote the concatenation operator. We first define strong one-way permutations. Let $f : \{0,1\}^* \to \{0,1\}^*$ be a length-preserving function that is easy (e.g. polynomial time) to compute. f is a permutation if it is one-to-one as well. Let $T : \mathbb{N} \to \mathbb{R}$ and $\epsilon : \mathbb{N} \to [0, 1]$. The security of f is $(T(n), \epsilon(n))$ if the following statement is true for infinitely many n. For a given n, for all probabilistic adversaries A that run for time $< T(n)$, $\mathrm{Prob}[A(f(x)) \in f^{-1}(f(x))] < \epsilon(n)$, where the probability is taken over the input $x \in \{0,1\}^n$ and the coin-flips of A. f is said to be a "strong" one-way function if it has security $(n^{\omega(1)}, 1/n^{\omega(1)})$, where $\omega(1)$ is any function on integers that goes to infinity asymptotically. For ease of exposition, whenever n is understood, we will use T and $\epsilon$ in place of T(n) and $\epsilon(n)$, respectively.

A pseudorandom generator g accepts a short random seed x of length, say n, and produces a sequence of bits $g(x) = b_1, b_2, \ldots, b_m$ for some m > n. g is defined to have bit security $(T_b(n), \epsilon_b(n))$ if, for any i, $1 \le i < m$, any probabilistic algorithm that is given $b_1, \ldots, b_i$, after running for time $< T_b$, cannot predict $b_{i+1}$ with probability $> \frac{1}{2} + \epsilon_b$. Analogously, a pseudorandom sequence $b_1, \ldots, b_l$ is defined to have security $(T_s, \epsilon_s)$ if any probabilistic algorithm, after running for time $T_s$, cannot distinguish $b_1 \ldots b_l$ from l truly random bits with probability $> \frac{1}{2} + \epsilon_s$. Yao [22] showed that the bit security of a generator and the security of any l bits of its output are tightly related: [...]. g is cryptographically strong (or simply strong) if its security is such that whenever $T_b(n)$ is feasible, $\epsilon_b(n)$ is negligible. Hence, if g is strong, no efficient algorithm can distinguish g from a random source with better than negligible probability.

Goldreich and Levin [?] gave a construction for a PRG using any one-way permutation such that the bit security of the generator is nearly the same as the security of the underlying one-way function. In [?] a similar construction for a PRG is given for any cipher (which is secure in an appropriately defined model) such that the bit security of the generator is nearly that of the cipher. In our construction of G given in Section 3 we assume that $G_0$ is such a strong generator.



4.2 Pseudorandom Functions 

Pseudorandom functions were defined in m- They presented a construction 
that makes as many calls as the input length to a PRG that doubles the input 
length. Recently, a more economical construction based on the Decisional Diffie- 
Hellman assumption was given in PI- However, all constructions whose security 
is reducible to the security of well-known number theoretic problems are too 
slow for many applications. In situations which require an efficient prf, a keyed
cryptographic hash function or a block cipher (with Merkle’s construction which
destroys the invertibility) is often used. Rather than the security of the prf
being reduced to that of a well-known problem, it is simply asserted on the
strength of empirical evidence. In such situations it seems prudent to try as much
as possible to reduce the security requirements of the cryptographic primitive 
whose security is being asserted. This is the approach we take here. 

Let us first recall the definition of a prf. Traditionally, prf’s are used as
length-preserving functions but we will allow the general version here. A
function family $\mathcal{F}$ with parameters $(k, m_1, m_2)$ is a collection of
efficiently computable functions $f_K$ mapping $m_1$ bits to $m_2$ bits, indexed by
$K \in \{0,1\}^k$. Given the string $K$ one can easily compute the function $f_K$. We
will assume that the parameters of the family can be increased suitably and
without any bound.

Definition 1 (Security of prf) A pseudorandom function family $\mathcal{F}$ has
security $(T, U, \delta)$ if, for every valid $m_1$, no probabilistic adversary $A$
given access to an oracle of $f_K \in_{random} \mathcal{F}_{m_1}$, running in time $T$ and
making at most $U$ queries to the oracle, can distinguish the oracle from a
random function oracle with probability (over the adversary’s coin flips and
choice of $K$) $> \delta$.

^1 Bit security of g has also been defined as st, = ^ (see,e.g., M) but we prefer the
two-parameter model as it is more transparent.

We observe that a slight generalization of the classical definition of a prf
family is to allow the polynomially bounded adversary access to many instances
$f_{K_i}$ (the number is determined adaptively by the adversary), where the
adversary does not know any of the $K_i$’s but can query any $f_{K_i}$ at the cost
of a query each. Next, we define a weaker notion of prf that we suggest is
useful and perhaps more tenable. This notion was independently proposed as
random-challenge pseudorandom function in im. A function family is random-input
pseudorandom (ri-prf) if the adversary has to distinguish a member of this
family from a truly random one on random queries. That is, the adversary does
not adaptively choose the queries as in the definition of a pseudorandom
function; rather, random query points are chosen for her. More formally, the
adversary sees a sequence $(x_1, f_K(x_1)), \ldots, (x_U, f_K(x_U))$ where each $x_i$ is
chosen uniformly at random from $\{0,1\}^{m_1}$ and the key $K \in \{0,1\}^k$ is chosen
uniformly and is unknown to the adversary.

Definition 2 (Security of ri-prf family) A function family $\mathcal{F}$ is
random-input pseudorandom with security $(T, U, \delta)$ if, for every valid $m_1$, no
probabilistic adversary $A$ given access to a random-input oracle of
$f_K \in_{random} \mathcal{F}_{m_1}$, running in time $T$ and seeing at most $U$ queries,
can distinguish $U$ input-output pairs for $f_K$ from $U$ input-output pairs for a
truly random function, where the input queries are random, with probability
(over the adversary’s coin flips, and choice of $K$ and $x_i$’s) $> \delta$.

One can trivially get a strong PRG $g$ from a ri-prf and a pseudorandom source
(or a long enough seed). $g$ uses the pseudorandom source to select a random
member of the ri-prf family and an initial random input. Iteratively, for every
successive block of pseudorandom bits to be output, $g$ draws as many bits as the
input length of the ri-prf and outputs these bits as well as the value of the
ri-prf on that input. One can stop here and use this construction for a strong
generator if one can find concrete functions that are efficient in practice and
can be modeled as ri-prf’s with good security. However, the ri-prf assumption
may be too strong for some concrete functions and we would like to weaken the
assumption of security further. To this end, we first note that the security of
a ri-prf is at least as high if the input is hidden from the adversary. We call
such a function a hidden-input pseudorandom function (hi-prf). More formally,
we define an hi-prf family as follows.

Definition 3 (Security of hi-prf family) A function family $\mathcal{F}$ is
hidden-input pseudorandom with security $(T, U, \delta)$ if, for every valid $m_1$, no
probabilistic adversary $A$ given access to a random-input oracle of
$f_K \in_{random} \mathcal{F}_{m_1}$, running in time $T$ and seeing at most $U$ outputs,
can distinguish $U$ outputs of $f_K$ from $U$ outputs of a truly random function,
where the input queries are random and unknown, with probability (over the
adversary’s coin flips, and choice of $K$ and $x_i$’s) $> \delta$.

A further variation on hi-prf is when the random inputs are chosen without
replacement. Note that drawing with or without replacement for ri-prf is an
inessential distinction since the adversary can see when an input is repeated.
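
The generator obtained from a ri-prf and a random source, as described before Definition 3, can be sketched as follows. This is only an illustration under stated assumptions: `ri_prf` is a hypothetical stand-in (SHA-256 over key and input), not a function proposed in this paper, and `os.urandom` plays the role of the pseudorandom source.

```python
import hashlib
import os


def ri_prf(key: bytes, x: bytes) -> bytes:
    # Hypothetical stand-in for a member f_K of a ri-prf family.
    return hashlib.sha256(key + x).digest()


def generate(num_blocks: int, key_len: int = 32, input_len: int = 8,
             source=os.urandom) -> bytes:
    """Output num_blocks blocks, each consisting of x followed by f_K(x)."""
    key = source(key_len)          # selects a random member of the family
    out = bytearray()
    for _ in range(num_blocks):
        x = source(input_len)      # fresh random input, visible in the output
        out += x
        out += ri_prf(key, x)
    return bytes(out)
```

In the hidden-input variant of Definition 3 the loop would emit only `ri_prf(key, x)` and keep `x` secret.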

4.3 prf’s vs ri-prf’s vs hi-prf’s vs prg’s

Any ri-prf or hi-prf is a one-way function, as is clear from the definition.
Thus, the notions of prf, ri-prf, hi-prf, and PRG are equivalent in terms of
their existence to that of one-way functions. Nevertheless, we show that there
exists a function which is an hi-prf but not a ri-prf, and another that is a
ri-prf but not a prf.

First, let us assume we are given a pseudorandom function $f$ from $n$ bits
to $n$ bits. Note that the simple function $F_K(x) := (x, f_K(x))$ is an hi-prf
from $n$ bits to $2n$ bits. However, it is not a ri-prf since the input, seen by
the adversary, and the output are obviously correlated. A ri-prf which is not a
prf can be constructed as follows using a standard two-round Feistel network:
$F_{K_1,K_2}(x, y) = (z, f_{K_2}(z) \oplus x)$ where $z = f_{K_1}(x) \oplus y$. This function
is not a prf since, for inputs $(x, y_1)$ and $(x, y_2)$, $z_1 \oplus z_2 = y_1 \oplus y_2$, and
thus the adversary who can choose these inputs can easily distinguish this
function from a random one. However, for random choices for $x$ and $y$, using
ideas of m we can show that the outputs are indistinguishable from random if $f$
is a prf.
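
The chosen-input distinguisher for the two-round Feistel example can be sketched as below. Assumptions made for illustration: the second output half is taken to be $f_{K_2}(z) \oplus x$, as in a standard two-round Feistel network, and the underlying prf is a hypothetical stand-in (truncated HMAC-SHA256); all function names are illustrative.

```python
import hashlib
import hmac
import os

BLOCK = 16  # bytes per half; an arbitrary choice for this sketch


def prf(key: bytes, x: bytes) -> bytes:
    # Hypothetical stand-in for f_K: HMAC-SHA256 truncated to BLOCK bytes.
    return hmac.new(key, x, hashlib.sha256).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))


def feistel2(k1: bytes, k2: bytes, x: bytes, y: bytes):
    """Two-round Feistel: z = f_K1(x) xor y, second half f_K2(z) xor x."""
    z = xor(prf(k1, x), y)
    return z, xor(prf(k2, z), x)


def looks_like_two_round_feistel(oracle) -> bool:
    """Query (x, y1) and (x, y2) with the same x and test z1 xor z2 == y1 xor y2."""
    x = bytes(BLOCK)
    y1 = bytes(BLOCK)
    y2 = b"\x01" * BLOCK
    z1, _ = oracle(x, y1)
    z2, _ = oracle(x, y2)
    return xor(z1, z2) == xor(y1, y2)
```

The test always succeeds against `feistel2`, whereas a truly random function passes it only with probability $2^{-128}$ for this block size. The attack needs two queries sharing the same $x$, which is exactly what the random-input model denies the adversary.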

Following the comment after Definition 1, we note that the model of security
for ri-prf and hi-prf can be generalized (as in the case of prf’s) to allow
multiple independent instantiations that an adversary can query. Secondly, we
note that any function that is a ri-prf is an hi-prf. The distinguishing
probability for an adversary who does not see the inputs in the hi-prf model may
be significantly lower than when she can see the inputs in the ri-prf model for
the same function. Finally, note that an hi-prf which is not length increasing
is trivial (the identity function is one).

Note that a secure hi-prf family is equivalent to a PRG whose outputs on
correlated seeds are indistinguishable from random when the seeds are of the
form $K \circ x_i$ where $K$ is random but fixed and the $x_i$’s are random and
independent. However, we can distinguish hi-prf’s from PRG’s as follows. As
before, let us say that we have an hi-prf family whose members are indexed with
$k = |K|$ bits and that we have $n$ independent instantiations of this family.
Then, in $i$ iterations, for an input of $k \cdot n + i \cdot m_1$ random bits, we get
$i \cdot n \cdot m_2$ output bits. In order to compare similar things, let us consider a
PRG which takes $n$ seeds of $m_1$ bits each and outputs $i \cdot n \cdot m_2$ bits in $i$
iterations. The difference between the two is that the hi-prf construction uses
an extra $n \cdot k$ bits, which may make a difference in their security in the
following sense. There may exist functions whose security is lower when used as
a PRG (i.e. without the extra $n \cdot k$ random bits of input) as compared to
hi-prf’s.

To make this concrete, consider the example of MD5 with $n = 1$, $i = 2$, and
$m_1 = 64$ (implying $k = 448 = 512 - 64$). The difference between the two modes is
as follows: in the case of the PRG, in each iteration we use MD5 with $m_1$ random
bits and a fixed pad of length $k$ known to the adversary. Contrast this with
the hi-prf mode of using MD5, where the $k$ pad bits are chosen at random and
are fixed and secret from the adversary. At each iteration, we run MD5 on $m_1$
random bits and this pad. It is quite conceivable that the bits output by MD5
in the second mode are much more secure than in the first.
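
As a rough check of the bit counts in this example, with two iterations and assuming $m_2 = 128$ (the MD5 output length, which this paragraph does not restate):

\[
\underbrace{k \cdot n + i \cdot m_1}_{\text{hi-prf random input}} = 448 + 2 \cdot 64 = 576,
\qquad
\underbrace{i \cdot m_1}_{\text{PRG random input}} = 128,
\qquad
\underbrace{i \cdot n \cdot m_2}_{\text{output}} = 2 \cdot 128 = 256 .
\]

The two modes differ exactly in the $n \cdot k = 448$ secret pad bits.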

A pseudorandom synthesizer has been defined as a function whose outputs cannot
be distinguished from random even when the adversary knows the outputs on a
matrix of inputs generated as a cross-product of two lists of random values. Our
construction may be seen as a one-dimensional case of a synthesizer where one
of the two inputs is held constant while the other varies over a random list.

4.4 Security of G 

Note that from the construction of $G$ and the definition of security of hi-prf,
if $G_0$ is a secure PRG and if $F(x) := f_{K_1}(x) \circ \cdots \circ f_{K_n}(x)$ is a secure
hi-prf, then $G$ is a secure PRG. In the next lemma we will show that if $f$ is a
secure ri-prf then $F$ is a secure ri-prf. It follows immediately that $F$ is also
a secure hi-prf.

Lemma 1 If $f_K$, with $K$ chosen at random, is a secure ri-prf with security
$(T, U, \epsilon)$ then $f_{K_1} \circ \ldots \circ f_{K_n}$ with $K_1, \ldots, K_n$ chosen at random is a
ri-prf with security $(T', U, \epsilon')$ where $T'/\epsilon' < nT/\epsilon$.

Proof Sketch. Define $D_n$ to be the input/output sequence given by the composite
construction. That is, one input/output pair looks like
$r, f_{K_1}(r), \ldots, f_{K_n}(r)$ where $r$ is random. Define $D_0$ to be the input/output
sequence given by $n$ truly random functions, i.e., one input/output pair looks
like $r, R_1, \ldots, R_n$ where all the values are random. More generally, let $D_i$ be
the input/output sequence defined by the following input/output pairs:
$r, f_{K_1}(r), \ldots, f_{K_i}(r), R_{i+1}, \ldots, R_n$, where $R_{i+1}, \ldots, R_n$ and $r$ are
random. Using a standard argument initiated in m, one can show that if there is
an adversary distinguishing $D_0$ from $D_n$ with probability $\epsilon'$ then there is
some $i$ such that the same adversary can distinguish between $D_{i-1}$ and $D_i$ with
probability $\epsilon'/n$. This adversary can then be used to build an adversary
distinguishing (with the same probability) sequences whose input/output pairs
are given by $r, f_K(r)$, for $r$ random, from sequences whose input/output pairs
are given by $r, R$ where both $r$ and $R$ are random. □
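
The standard argument referred to in the proof sketch can be spelled out as a telescoping sum. Writing $p_i := \mathrm{Prob}[A(D_i) = 1]$ for the probability that the adversary $A$ outputs 1 on the $i$-th hybrid sequence (notation introduced here for illustration only):

\[
\epsilon' \le |p_n - p_0|
= \Big| \sum_{i=1}^{n} (p_i - p_{i-1}) \Big|
\le \sum_{i=1}^{n} |p_i - p_{i-1}| ,
\]

so at least one index $i$ must satisfy $|p_i - p_{i-1}| \ge \epsilon'/n$.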

4.5 Random-Input Pseudorandom Functions from Hash Functions 

As noted earlier, since “provable” prf’s are often too slow in software, one
resorts to constructions based on cryptographic hash functions like MD5 and
SHA-1 under the assumption that they behave like prf’s. As noted above, milder
security assumptions are preferable. We use the compression function of a secure
hash function here. These have considerably longer input buffers than outputs.
We limit the attacker’s ability to choose the inputs by fixing a substantial
portion of the input buffer with a random string, so that in effect the
compression function on the remainder of the input acts as a length-increasing
function.

Definition 4 (ri-prf’s from secure compression functions) A family
$\mathcal{F}_{m_1}$ of ri-prf’s, for $h : \{0,1\}^{k+m_1} \to \{0,1\}^{m_2}$ where $m_1 < m_2$, is
defined as follows. Let $Mix(\cdot)$ be a length-preserving function that is easy to
compute, and let $K$ be randomly chosen from $\{0,1\}^k$. Then
$f_K : \{0,1\}^{m_1} \to \{0,1\}^{m_2}$ is defined as

\[ f_K(x) := h(Mix(K \circ x)). \]

One may choose $Mix$ reflecting the beliefs about the security of the hash
function. Examples of $Mix(K, x)$ are $(K \circ x)$, $(x \circ K)$,
$(\mathrm{lefthalf}(x) \circ K \circ \mathrm{righthalf}(x))$, etc. A successful attack on such
ri-prf’s must work for most random choices of $K$ by the user (not the attacker’s
random choices). Our current knowledge of attacks on MD5 and SHA-1 suggests that
it is reasonable to assume that their compression functions yield suitable
ri-prf’s. Even if this assumption proves to be false, the construction of the
generator still holds as long as there exists a single efficient ri-prf.
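
A minimal sketch of Definition 4 with the three example choices of $Mix$ is given below. It is an approximation under a loud assumption: `hashlib.md5` computes the full MD5 hash including its padding, not the bare compression function $h$, and is used here only as a readily available stand-in. The sizes correspond to $k = 448$ and $m_1 = 64$; all function names are illustrative.

```python
import hashlib

K_BYTES = 56  # k = 448 key (pad) bits
X_BYTES = 8   # m_1 = 64 input bits


def mix_key_first(key: bytes, x: bytes) -> bytes:
    return key + x                       # K o x


def mix_key_last(key: bytes, x: bytes) -> bytes:
    return x + key                       # x o K


def mix_split(key: bytes, x: bytes) -> bytes:
    half = len(x) // 2
    return x[:half] + key + x[half:]     # lefthalf(x) o K o righthalf(x)


def make_f(key: bytes, mix=mix_key_first):
    """Return f_K with f_K(x) = h(Mix(K, x)); h is approximated by full MD5."""
    if len(key) != K_BYTES:
        raise ValueError("key must be K_BYTES long")

    def f(x: bytes) -> bytes:
        if len(x) != X_BYTES:
            raise ValueError("input must be X_BYTES long")
        return hashlib.md5(mix(key, x)).digest()

    return f
```

Each `mix_*` function is length preserving on the 64-byte buffer, as the definition requires.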

4.6 The Feedback Construction 

An alternative method for $P_0$ to get the $m_1$-bit inputs is by feeding back some
of the output rather than by getting them from $G_0$. The output length in each
iteration then becomes $n m_2 - m_1$. In this section, we analyze the various ways
in which output bits can be fed back into the input so that the construction can
be iterated. For clarity, let us set $n = 1$, i.e. there are no multiple
instantiations of the underlying function. The following discussion also holds
for polynomially many independent instantiations of the underlying function. To
be more concrete, one can imagine that the function family is instantiated using
a hash function like MD5 where the index is simply the random bits used in the
pad. However, the arguments hold in general. First, let us consider the simplest
variant $g_1$ of the generator. Let $(y)_{\leftarrow j}$ and $(y)_{\rightarrow j}$ be,
respectively, the first and last $j$ bits of a string $y$. Then, $g_1$ can be
described as follows: at the $i$-th step ($i > 0$), output
$y_i := f_K((y_{i-1})_{\leftarrow m_1})$ with $y_0$ being a random seed of $m_1$ bits. If
$f_K$ is a member of a ri-prf family, $g_1$ is a strong generator. Because an
hi-prf derived from a ri-prf by hiding the input is at least as secure, and
perhaps much more secure, we can claim that the following generator $g_2$ has the
same or more security than $g_1$. At the $i$-th iteration,
$y_i := f_K((y_{i-1})_{\leftarrow m_1})$; if $i$ is the last iteration, output $y_i$, else
output $(y_i)_{\rightarrow (m_2 - m_1)}$.

Finally, we can show that the following generator $g_3$ has higher security than
$g_2$ under the assumption that we have access to a pseudorandom source $g_0$ whose
security is higher than the security of $f_K$. $g_3$ can be described simply as: at
the $i$-th step, $g_0$ outputs a random $x_i$ of length $m_1$ and $g_3$ outputs
$f_K(x_i)$. Let $\epsilon$ be the probability of distinguishing the $m_1$ feedback bits
from random. It can be shown by induction that after $t$ iterations, the
probability of distinguishing these $m_1$ feedback bits is now only bounded by
$t\epsilon$. Thus, the security of $t$ outputs of the alternative $P_0$ is degraded by
this amount. Let us compare this to using $G_0$ instead of truly random bits for
$t$ outputs of the standard $P_0$; suppose that the security of $m_1$ bits of $G_0$ is
$\epsilon'$. It can be shown that the security of $t$ outputs of $P_0$ is degraded by
$t\epsilon'$. Thus, whenever $G_0$ is more secure than the feedback bits of $P_0$, using
$G_0$ will result in a more secure (albeit potentially slower) generator. The
$G_0$ we will use in our implementations is based on a strong cipher such as DES
and is possibly much more secure than any $m_1$ bits of $P_0$.
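
The feedback variant with hidden inputs can be sketched as follows. This is a sketch under assumptions: the first $m_1$ bits of each value are the ones fed back and the remaining $m_2 - m_1$ bits are output, which is one reading of the description above, and `f` is a hypothetical stand-in (full MD5 over pad and input) rather than a bare compression function.

```python
import hashlib

M1_BYTES = 8    # m_1 = 64 feedback bits
M2_BYTES = 16   # m_2 = 128 bits produced per call of the stand-in


def f(key: bytes, x: bytes) -> bytes:
    # Hypothetical stand-in for f_K: MD5 over the secret pad followed by x.
    return hashlib.md5(key + x).digest()


def feedback_generator(key: bytes, seed: bytes, steps: int) -> bytes:
    """Iterate y_i = f_K(first m_1 bits of y_{i-1}), hiding the fed-back bits."""
    feedback = seed                      # y_0: random seed of m_1 bits
    out = bytearray()
    for i in range(1, steps + 1):
        y = f(key, feedback)
        feedback = y[:M1_BYTES]          # kept secret, reused as next input
        # Only the last step may reveal the whole value.
        out += y if i == steps else y[M1_BYTES:]
    return bytes(out)
```

Each intermediate step therefore contributes $m_2 - m_1$ bits, matching the output length stated for $n = 1$.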

5 Properties of G 

5.1 Dealing with Birthday Collisions 

As shown in the previous section, the process $P_0$ in and of itself produces a
cryptographically strong pseudorandom sequence. However, in practice, when
actually setting the parameters so that $P_0$ is very fast, the value of $m_1$ is
relatively small so as to reduce the load on the strong pseudorandom generator.
In such a case, $m_1$ may not be large enough to avoid birthday collisions. By the
birthday paradox, in about $2^{m_1/2}$ steps, some $x_i$ is quite likely to repeat,
which would help to distinguish outputs of $P_0$ from a truly random sequence of
comparable length. A similar problem occurred in [Q. The solution we propose
here is the same as for $G_{VRA}$. In parallel with $P_0$ we perform a random walk
$P_1$ and then take the bitwise xor of the two outputs as the output of the
generator $G$. This is precisely the generator described in Section 0
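
To get a feel for when birthday collisions among the $m_1$-bit inputs become likely, the usual approximation $1 - e^{-t(t-1)/2^{m_1+1}}$ for the probability of at least one repeat among $t$ uniform samples can be evaluated directly. The function name below is illustrative and the formula is the standard birthday estimate, not a bound taken from this paper.

```python
import math


def birthday_collision_probability(samples: int, bits: int) -> float:
    """Approximate probability that `samples` uniform `bits`-bit values contain a repeat."""
    space = 2.0 ** bits
    # -expm1(-v) computes 1 - exp(-v) accurately even for very small v.
    return -math.expm1(-samples * (samples - 1) / (2.0 * space))
```

For $m_1 = 64$ the estimate is negligible at $t = 2^{16}$ and roughly 0.39 at $t = 2^{32}$, which is why a 64-bit input alone does not suffice for long output sequences.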

The intuition behind this construction is as follows. For a random walk on 
an expander of 2^ nodes, the probability that the walk is at the same node 
that it was t steps ago is very small, say * for some c > 0. Hence for large 
enough $t$ this probability is small and offers good “long range” decorrelation
properties, while in the short range, an output of $P_1$ can easily repeat.
However, the probability that $P_0$ returns in $t$ steps is negligible for small $t$
(“in the short range”) and, as mentioned above, in the long range the choice of
$m_1$ may make the output repeat. The idea is that by xoring the two processes,
the probability of return will be relatively small for all values of $t$.

Before we quantify these remarks, we observe that adding $P_1$ to $P_0$ does not
weaken the cryptographic security of $P_0$. Indeed, if we are given a
distinguisher $D$ for $G$ we can attack $P_0$ as follows: generate $P_1$ of suitable
length independently and xor it with the output of $P_0$. Now $D$ will distinguish
this from random strings; however, this would be impossible if $P_0$ were purely
random.

5.2 Return Time for G 

A generator is said to enter a cycle of period $M$ after a transient period of
$M_0$ steps if $Z_i = Z_{i+M}$ for all $i > M_0$, for some $M_0$. It is desirable for a
PRG to have a large cycle length $M$, since it gives an upper bound on the
maximum number of useful outputs that can be obtained from $G$ with a single
seed. A generator with security $(T(n), \epsilon(n))$ cannot, on average, have cycle
length substantially smaller than $T(n)(1 - \epsilon(n))$ since one can distinguish
the generator’s output from random sequences with certainty in time proportional
to the cycle length. However, this does not preclude the generator from having
small cycle lengths on some small fraction of inputs (seeds). In this section,
we show that the probability that $G$ repeats an input is very small. Note that
repeating an earlier output is necessary but not sufficient for a generator to
be in a cycle. Using the theory of random walks on expanders, it has been shown
that the probability of $G_{VRA}$ repeating its starting point is negligible. In
fact, it is smaller than either the probability that $P_0$ repeats or the
probability that $P_1$ repeats. A similar result holds for the probability that
$G$ repeats its starting point (or any other point for that matter) and is given
in the lemmas below. To state the lemma we let $V(m)$ and $W(m)$ be the
probabilities that $P_0$ and $P_1$, respectively, repeat after $m$ steps.
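
The transient and period of a deterministic iterated map can be measured with Floyd's cycle-finding method, sketched below for illustration. This is a generic technique, not part of the paper's analysis; `step` is any state-update function, and the transient is counted from index 0 as the number of steps before the sequence first enters its cycle.

```python
def find_cycle(step, start):
    """Return (transient, period) of the sequence start, step(start), ..."""
    # Phase 1: find a meeting point inside the cycle.
    tortoise = step(start)
    hare = step(step(start))
    while tortoise != hare:
        tortoise = step(tortoise)
        hare = step(step(hare))
    # Phase 2: restart the tortoise; both meet at the first cycle element.
    transient = 0
    tortoise = start
    while tortoise != hare:
        tortoise = step(tortoise)
        hare = step(hare)
        transient += 1
    # Phase 3: walk once around the cycle to measure its length.
    period = 1
    hare = step(tortoise)
    while tortoise != hare:
        hare = step(hare)
        period += 1
    return transient, period
```

The method uses constant memory, but its running time is proportional to the transient plus the period, so it is only practical for toy state sizes; this matches the remark that distinguishing via a cycle takes time proportional to the cycle length.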

Lemma 2 The probability that $G$ repeats its first output at the $m$-th step, for
$m > 2$, satisfies

\[ V(m)W(m) \le R(m) := \mathrm{Prob}[Z_m = Z_1] \le \min\big(V(m),\; 2^{-L} + \lambda_2^m (1 - 2^{-L})\big) \]

where the probability is over the choice of inputs to $P_0$ and the random walk at
each step. Moreover, $\lambda_2$ is the second largest eigenvalue of the expander graph
in $P_1$ in absolute value.

The following lemma gives a lower bound on $W(m)$ in terms of the degree of
the expander.

Lemma 3 (Alon-Boppana) The probability W{m) of return of Pi in m steps 
is bounded below as: 

Prob[Ym = Yi] > — p)p{m — 1) for odd m > 3 

> d~'^p^ p{m) for even m > 2 



where p is the probability of staying at the same node, p{2r) = ^{^^_i)d{d—lY ^ , 
r >2. (It is easy to compute W (m) exactly for m = 1, 2, 3.^ 

Random walks on expander graphs have other strong statistical properties. 
For example, they generate a sequence of numbers that are statistically almost 
as good as random and independent numbers for use in Monte-Carlo estimations
as was shown by Gillman mg. That is, the mean of any bounded function over 
the node values can be estimated with an error which decreases exponentially 
with the number of samples. This was generalized in m to the case when the 
random walk node label was xored with the output of the Pq given in that paper 
and it extends to our case with G. 



5.3 Alternate Ways to Avoid Birthday Repeats 

Note that the security is not independent of $m_2$. Indeed, after $2^{m_1/2}$
input-output pairs, some output is likely to repeat whereas for a true random
source this does not happen until $2^{m_2/2}$ outputs. This is taken care of by the
$P_1$ process. However, the $P_1$ process can be difficult to handle for small
devices since it involves arithmetic modulo large numbers. We suggest some new
ways of avoiding the birthday attack without using $P_1$.

The essential idea is that the birthday attack on $P_0$ works in time $2^{m_1/2}$.
This is because we have fixed $l$ bits of the input. However, the ri-prf may have
significantly more security than $2^{m_1/2}$. We can exploit this additional
security by modifying the construction as follows. Some number of bits, say
$m_3$, of the $m_2$ output bits are fed back and only $m_2 - m_3$ bits are output. The
fed-back bits are temporarily stored until enough bits are accumulated to form a
new $n \times l$ matrix $A$ of key values. It takes $\lceil nl/m_3 \rceil$ rounds to compute
this new $A$. After this new $A$ has been computed, the key values for the
random-input pseudorandom functions are given by this $A$. Since $\lceil nl/m_3 \rceil$
is much less than $2^{m_1/2}$, the birthday attack is thwarted. In fact, it can be
shown that the security of the scheme can be reduced to syntactically weaker
security requirements for the ri-prf, since the adversary is only given
$\lceil nl/m_3 \rceil$ queries per random key with which to distinguish the outputs of
the function from random. We should also note that the bits for generating the
new $A$ need not be feedback bits alone. It may be desirable for some or all of
these bits to come from $G_0$. The decision about where to get the new bits of $A$
depends on the details of the security limitations of $G_0$ and the random-input
pseudorandom function.

One need not wait until all of the bits of the new $A$ are ready before
incorporating some of these bits into the new keys. In fact, the feedback bits
could be used as new bits for the key values of the next round. That is, at each
on-line step, we output the value of a new function
$f_K : \{0,1\}^{m_1} \to \{0,1\}^{m_2}$ where $K$ changes slightly at each step. For the
MD5 construction, $m_3$ could be 32, and the buffer is “rotated” right with the
extra 32 bits appended to the buffer. Heuristically, this appears to be a better
scheme than that above since the keys change at every step, but we are not able
to prove this formally.
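
One possible reading of the per-step key update is sketched below. Everything here is an assumption made for illustration: the key buffer drops its oldest $m_3 = 32$ bits and appends the 32 fed-back bits, the fed-back bits are taken from the front of the function value, and `hashlib.md5` (the full hash, not the bare compression function) stands in for the ri-prf.

```python
import hashlib

FEEDBACK_BYTES = 4  # m_3 = 32 bits fed back per step


def step(key: bytes, x: bytes):
    """One on-line step: return (updated key buffer, bits that are output)."""
    y = hashlib.md5(key + x).digest()
    feedback, output = y[:FEEDBACK_BYTES], y[FEEDBACK_BYTES:]
    # Shift the key buffer and append the fresh feedback bits, so that the
    # key changes slightly at every step while keeping its length.
    new_key = key[FEEDBACK_BYTES:] + feedback
    return new_key, output
```

With this reading each step outputs $m_2 - m_3 = 96$ bits and the whole 448-bit key buffer is replaced after 14 steps.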

5.4 Variations on Gvra 

A similar idea to the above for reducing birthday collisions can be used for
$G_{VRA}$. The $n \times L$ matrix $A$ can be interpreted as a linear array. The feedback
bits can be appended to the linear array to form a new linear array from 1 to
$n \times L + m_3$. A new matrix can be fashioned from the linear array from locations
$m_3$ to $n \times L + m_3$. Alternatively, each row of $A$ can be “rotated” right using
some of the feedback bits.

We also suggest some changes to $G_{VRA}$ that make it more suitable to a larger
set of applications. Recall that the matrix $A$ is completely filled in the
pre-processing step with random bits. As noted earlier, this may be a large cost
for some applications. The case of inadequate memory has been addressed with $G$.
Here we provide a solution for applications that occasionally need only a short
string of pseudorandom bits or need to amortize the overhead of the
preprocessing step. For example, if the number of bits needed is much less than
$n/k \times L$, then some rows of $A$ will not be used. For this scenario, we offer the
following alternative. We do not fill $A$ in the pre-processing step. Instead,
each row has a “filled” bit which indicates whether that row has been
initialized or not. In each on-line step, we first generate the random choice of
rows. For every row thus chosen, we first check the “filled” bit for that row
and, if it is not set, we go ahead and generate the random bits of the row.
Thus, only those rows that will actually be used will be filled in $A$.
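
The lazy filling of $A$ can be sketched as below. The class and method names are illustrative, and `os.urandom` stands in for whatever generator fills the rows; a `None` entry plays the role of an unset “filled” bit.

```python
import os


class LazyMatrix:
    """Matrix whose rows are generated only when first used."""

    def __init__(self, rows: int, row_bytes: int, source=os.urandom):
        self._rows = [None] * rows      # None means the "filled" bit is not set
        self._row_bytes = row_bytes
        self._source = source

    def row(self, index: int) -> bytes:
        if self._rows[index] is None:   # check the "filled" bit
            self._rows[index] = self._source(self._row_bytes)
        return self._rows[index]

    def filled_count(self) -> int:
        return sum(r is not None for r in self._rows)
```

An application that draws only a few bits pays for only the rows it actually touches, at the cost of one extra check per row access.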

6 Implementation 

In this section, we present the results of our sample implementation on a
233 MHz Pentium II workstation running Windows NT 4.0 with Visual C++ 5.0. The
algorithm was implemented entirely in C with no heroic attempts being made at
optimization. In order to implement the generator, we have to choose instances
of a strong pseudorandom generator and a ri-prf. For the strong generator, we
chose two examples: first, we used the outputs of single DES in OFB mode. We
also implemented the Goldreich-Levin generator based on Triple DES in OFB mode.
For comparison, we also implemented the strong generator using “alleged” RC4.
For the ri-prf implementation, we chose MD5 and SHA-1. These choices constrain
the parameters of our implementation of $G$ as follows. In the case of MD5, the
input buffer is of length 512 bits and the hashed output is 128 bits long. For
$n$ parallel instances of MD5, the memory requirement is around $512n$ bits. For
small devices one can take $n \le 8$ with a memory requirement ranging from 64 to
512 bytes. We chose $m_1 = 64$, which is the block length of DES.

We used the public domain implementations of DES and MD5/SHA-1. The raw speeds
of MD5 and SHA-1 were about 45 Mbits/sec and 32 Mbits/sec, respectively. The raw
speed of DES in OFB mode was found to be 25 Mbits/sec. The raw speed of the
Goldreich-Levin generator based on Triple DES was 10 Mbits/sec. For comparison,
RC4 ran at about 64 Mbits/sec on this platform. The following paragraph
summarizes the speeds for various combinations of our generator parameters
using DES or RC4 as the initial generator $G_0$.

Here we summarize the speeds in Mbits/sec of our implementations for various
values of $n$. In all cases, the strong generator was DES in OFB mode, and the
$P_1$ process was used. We observed that the $P_1$ process slows down the generator
by about 10% for the parameter values tested. The Goldreich-Levin generator
slowed down the generator by approximately 20%. We also noted that for higher
values of $n$ the speeds continued to increase but the memory requirement for
$P_0$ was more than 1 Kbyte. The speeds for $n = 1, 2, 4, 8$ for MD5 were,
respectively, 23, 30, 35, 38 Mbits/sec. The corresponding figures for SHA-1 were
21, 25, 27, 28 Mbits/sec, respectively. For comparison we report the speed of
$G$ when using RC4 as $G_0$. When $n = 8$, $G$ ran at 41 Mbits/sec and 32 Mbits/sec
using MD5 and SHA-1, respectively. Although RC4 is much faster than DES, it does
not improve the speed of $G$ significantly. The conclusion is that the factor
limiting the speed of $G$ is the hash function (MD5 or SHA-1).

Abstract. SOBER is a new stream cipher that has recently been developed
by Greg Rose for possible applications in wireless telephony Pj.
In this paper we analyze SOBER and point out different weaknesses.
In the case where an attacker can analyze key streams generated for
consecutive frames with the same key, we present an attack that, in our
implementation, requires less than one minute on a 200 MHz Pentium.



1 Overview and Motivation 

Encryption schemes that are used in wireless telephony have to meet rather 
difficult constraints. The schemes have to encrypt rather large amounts of data, 
but mobile stations often have very limited computational power. Hence the 
encryption scheme must be quite efficient. Simple schemes, however, run
a high risk of being insecure.

In this paper we analyze the stream cipher SOBER, which was developed by 
Greg Rose 0. An improved version of SOBER U has recently been submitted 
for a TIA (Telecommunications Industry Association) standard. The goal of the
TIA standards process is to replace ORYX and CMEA, for which weaknesses 
have been discovered |6I5| . 

The outline of the paper is as follows. Section 2 describes the notation used
in this paper. We review SOBER in Section 3. Section 0 analyzes SOBER under
the assumption that an attacker knows the first 4 bytes of 64 key streams all 
generated with the same key and the frame numbers 0 through 63. This attack 
is the most severe attack described in our paper, since it requires only 256 bytes 
of known key stream and is very efficient. Our implementation often finds the 
key in less than a minute on a 200 MHz Pentium. That is, we reduce the time
complexity from 2^^® for a brute force search to about 2^®. Section 0 analyzes 
SOBER under the more restrictive assumption that an attacker knows only 
one key stream rather than several as assumed before. Section 0 describes a 
weakness in the key setup procedure. In particular, we show that using an 80 bit 
key instead of a 128 bit key can considerably simplify cryptanalysis. In Section 0 
we show how to find keys that produce no output. While this is not a severe 
flaw in SOBER, such keys might be used for a denial of service attack in certain 
protocols.

2 Notation, Underlying Mathematical Structure and
Basic Operations

Most of our cryptanalysis in this paper is based on heuristic arguments. For 
example, we often assume that certain values are uniformly distributed. We will 
call statements based on heuristic arguments claims as opposed to statements 
that are rigorously provable, which are usually called theorems. However, we 
have verified our claims if possible with computer experiments. 

SOBER uses operations over the ring $\mathbb{Z}/(256)$ and the field $GF(2^8)$ and
mixes these operations. Hence, the computations depend heavily on the exact
representation of elements in these structures.

The field $GF(2^8)$ is represented as $GF(2)[x]/(x^8 + x^6 + x^3 + x^2 + 1)$. Addition
and multiplication over $GF(2^8)$ will be denoted by $\oplus$ and $\otimes$ respectively.
The symbol ‘+’ will denote addition over the ring $\mathbb{Z}/(256)$ and ‘$\wedge$’ will
denote the bitwise logical AND. Elements in $\mathbb{Z}/(256)$ and $GF(2^8)$ can both be
represented with one byte. An implicit conversion $\phi$ takes place when an
element of $GF(2^8)$ is used for an operation over $\mathbb{Z}/(256)$. This conversion
$\phi$ can be defined by

\[ \phi\Big(\sum_{i=0}^{7} c_i x^i\Big) = \sum_{i=0}^{7} c_i 2^i \quad \text{where } c_i \in \{0,1\}, \]

i.e. the coefficients of polynomials in $GF(2)[x]/(x^8 + x^6 + x^3 + x^2 + 1)$ are
interpreted as bits of the binary representation of integers in the interval
[0,255]. For the rest of the paper we will use $\phi$ and $\phi^{-1}$ as implicit
conversions between $GF(2^8)$ and $\mathbb{Z}/(256)$ when necessary, e.g. we can use the
constants 141 and 175 to represent the polynomials $x^7 + x^3 + x^2 + 1$ and
$x^7 + x^5 + x^3 + x^2 + x + 1$ respectively.
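
The byte representation and the two kinds of arithmetic can be illustrated with the following sketch. The reduction polynomial $x^8 + x^6 + x^3 + x^2 + 1$ (0x14D) is an assumption here, since the polynomial is garbled in this copy of the text; the function names are illustrative.

```python
MODULUS = 0x14D  # assumed field polynomial x^8 + x^6 + x^3 + x^2 + 1


def poly_to_byte(exponents) -> int:
    """The conversion phi: set bit i for every term x^i of the polynomial."""
    value = 0
    for e in exponents:
        value ^= 1 << e
    return value


def gf_add(a: int, b: int) -> int:
    return a ^ b  # addition in GF(2^8) is bitwise XOR


def gf_mul(a: int, b: int) -> int:
    """Shift-and-add multiplication in GF(2^8), reducing modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= MODULUS
        b >>= 1
    return result


def ring_add(a: int, b: int) -> int:
    return (a + b) % 256  # the '+' of the ring Z/(256)
```

The same pair of bytes generally gives different results under `gf_add` and `ring_add` (for instance 200 and 100), which is why the exact representation matters when the two structures are mixed.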

3 Description of SOBER 

SOBER is a stream cipher having a key length up to 128 bits. A stream cipher 
generally transforms a short key into a key stream, which is then used to encrypt 
the plaintext. In particular, the ciphertext is usually the bitwise XOR of the 
key stream and the plaintext. A key stream must not be used to encrypt two
plaintexts, since knowledge of two ciphertexts encrypted with the same key
stream would allow an attacker to extract the XOR of the two plaintexts. SOBER has a
feature, called frames, that allows us to use the same key multiple times without 
generating the same key stream. This is very useful in applications where a lot of 
small messages have to be encrypted independently. In particular the key stream 
generator has two input parameters, namely the key and the frame number. The 
frame number is not part of the key. It may be just a counter, and is sent to the 
receiver in clear or is known to him. Hence we have to assume that the frame 
number is known to an attacker too. 
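
The danger of reusing a key stream can be seen in a few lines of Python. The key stream bytes and the two plaintexts below are arbitrary values chosen for illustration; they are not SOBER output.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

key_stream = bytes([0x3A, 0x91, 0x5C, 0x07, 0xE2])  # arbitrary example values
p1, p2 = b"HELLO", b"WORLD"
c1 = xor_bytes(p1, key_stream)
c2 = xor_bytes(p2, key_stream)

# The key stream cancels out: c1 XOR c2 equals p1 XOR p2.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(p1, p2))
```

A frame number avoids this situation only if every (key, frame number) pair is used for at most one message.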

The main reason to choose a linear feedback shift register (LFSR) over
$GF(2^8)$ instead of an LFSR over $GF(2)$ was efficiency. In this paper we
consider the security of the cipher. Therefore we will not describe
implementation details.

3.1 LFSR 

SOBER is based on a linear feedback shift register of degree 17 over $GF(2^8)$
producing a sequence $s_n$ that satisfies the recurrence relation

$$s_{n+17} = (141 \otimes s_{n+15}) \oplus s_{n+4} \oplus (175 \otimes s_n). \qquad (1)$$

During the key and frame setup this recurrence relation is slightly modified to
add the key and frame number into the state of the LFSR. In particular the
following relation is used:

$$s_{n+17} = (141 \otimes s_{n+15}) \oplus s_{n+4} \oplus (175 \otimes (s_n + a_n)). \qquad (2)$$

The value $a_n$ depends on either the key or the frame number and will be
described in the next two sections.
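
To make the two recurrences concrete, here is a minimal Python sketch of one register cycle. It assumes the field polynomial x^8 + x^6 + x^3 + x^2 + 1 (written as the constant `POLY = 0x14D`); the names `gf_mul` and `lfsr_step` are ours, and the code is meant as an illustration rather than as a reference implementation of SOBER.

```python
POLY = 0x14D  # assumed field polynomial x^8 + x^6 + x^3 + x^2 + 1

def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

def lfsr_step(state, a=0):
    """Cycle the register once.

    state holds [s_n, ..., s_{n+16}]; the result holds [s_{n+1}, ..., s_{n+17}].
    With a = 0 this is the plain recurrence; a nonzero a is first added to the
    lowest byte modulo 256, as in the modified relation used during setup.
    """
    new = gf_mul(141, state[15]) ^ state[4] ^ gf_mul(175, (state[0] + a) & 0xFF)
    return state[1:] + [new]

state = lfsr_step([1] * 17)
```

Note that the addition of `a` is an integer addition modulo 256, while the other operations are field operations. This mixture is what makes the setup "almost" linear.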



3.2 Key Setup 



A key consists of 4 to 16 bytes, $K_0, \ldots, K_{\ell-1}$, where $\ell$ denotes
the length of the key in bytes. The LFSR is initialized to the first 17
Fibonacci numbers, i.e.

$$s_0 = s_1 = 1, \qquad s_n = s_{n-1} + s_{n-2} \pmod{256} \quad \text{for } 2 \le n \le 16.$$

Then each byte of the key is added to the lowest byte of the LFSR before the
LFSR is cycled, i.e. $a_n = K_n$, so that

$$s_{n+17} = (141 \otimes s_{n+15}) \oplus s_{n+4} \oplus (175 \otimes (s_n + K_n)) \quad \text{for } 0 \le n \le \ell - 1. \qquad (3)$$

Next, we set $a_\ell = \ell$ and $a_n = 0$ for $\ell < n < 40$ and compute the
state $(s_{40}, \ldots, s_{56})$ of the register. This concludes the setup of
the key.
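
The key setup just described can be sketched as follows. The field polynomial `POLY = 0x14D` is an assumption on our part, and the function names `fibonacci_state` and `key_setup` are hypothetical; the sketch follows the description in this section and has not been checked against an official SOBER implementation.

```python
POLY = 0x14D  # assumed field polynomial x^8 + x^6 + x^3 + x^2 + 1

def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

def fibonacci_state():
    """Initial register contents s_0, ..., s_16 (Fibonacci numbers mod 256)."""
    s = [1, 1]
    while len(s) < 17:
        s.append((s[-1] + s[-2]) & 0xFF)
    return s

def key_setup(key):
    """Return the register state (s_40, ..., s_56) after loading the key."""
    length = len(key)
    if not 4 <= length <= 16:
        raise ValueError("key must have 4 to 16 bytes")
    state = fibonacci_state()
    for n in range(40):
        if n < length:
            a = key[n]      # a_n = K_n for the key bytes
        elif n == length:
            a = length      # the key length is added right after the key
        else:
            a = 0           # remaining cycles add nothing
        new = gf_mul(141, state[15]) ^ state[4] ^ gf_mul(175, (state[0] + a) & 0xFF)
        state = state[1:] + [new]
    return state

print(key_setup(bytes(range(16))))
```

Adding the key length is what separates, for example, the 4-byte all-zero key from the 5-byte all-zero key.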

Setting $a_\ell = \ell$ guarantees that the state of the LFSR after the
initialization is unique for all keys. The key setup procedure is an almost
linear operation. In particular, there exists a $17 \times 18$ matrix $M$
independent of the key, such that

$$\begin{pmatrix} s_{40} \\ s_{41} \\ \vdots \\ s_{56} \end{pmatrix}
= M \begin{pmatrix} 1 \\ s_0 \oplus (s_0 + a_0) \\ s_1 \oplus (s_1 + a_1) \\ \vdots \\ s_{16} \oplus (s_{16} + a_{16}) \end{pmatrix} \qquad (4)$$

3.3 Frame Setup 

In some applications the same key is used to produce more than one key stream.
To achieve this, a frame number $f$ is used to generate a different key stream
each time the key is used. A frame number is a 32-bit integer. It is added in
11 steps to the LFSR. In particular the $n$-th step adds the bits
$3n, \ldots, 3n+7$ to the lowest register of the LFSR and cycles the LFSR
afterwards. Hence

$$s_{n+17} = (141 \otimes s_{n+15}) \oplus s_{n+4} \oplus (175 \otimes (s_n + a_n)) \quad \text{for } 40 \le n \le 50,$$

where $a_n = \lfloor f / 2^{3(n-40)} \rfloor \bmod 256$.

The values $s_i$, $68 \le i \le 96$, are computed with (1). The frame
initialization is finished as soon as the state $(s_{80}, \ldots, s_{96})$ is
computed.
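
The eleven bytes that are added during the frame setup are overlapping 8-bit windows of the frame number, each shifted by 3 bits. A minimal Python sketch, assuming the window reading given in this section (the helper name `frame_words` is ours):

```python
def frame_words(f):
    """Return [a_40, ..., a_50]: bits 3i .. 3i+7 of the 32-bit frame number f."""
    return [(f >> (3 * i)) & 0xFF for i in range(11)]

print(frame_words(63))
```

For frame numbers 0 to 63 only the first two windows can be non-zero, which is the property exploited by the attack on the frames.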

3.4 Computation of the Key Stream 

The sequence $v_n$ is computed in a non-linear way from $s_n$ as follows:

$$v_n = (s_n + s_{n+2} + s_{n+5} + s_{n+12}) \oplus (s_{n+12} \wedge s_{n+13})$$

Assume that the stream cipher is in a state $n$, i.e. represented by
$(s_n, \ldots, s_{n+16})$. Then the cipher produces the key stream as follows:
$n$ is incremented by 1 and $v_n$ is stored in a special register called the
stutter control register. Hence, $v_{81}$ is the first stutter control byte if
a frame number is used. The stutter control register is used to generate 0 to 4
bytes of the key stream as follows: First the stutter control register is
divided into 4 bit pairs. For each of these bit pairs, starting with the least
significant bits, the following is done based on the value of the 2 bits:

bit pair   Action taken
00         Increment $n$ by 1. (No output is produced.)
01         Output the key stream byte $105 \oplus v_{n+1}$ and increment $n$ by 2.
10         Output the key stream byte $v_{n+2}$ and increment $n$ by 2.
11         Output the key stream byte $150 \oplus v_{n+1}$ and increment $n$ by 1.
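
The stuttering rule can be sketched in Python as follows. The function takes the sequence $v_n$ as a callable, so it is independent of the LFSR itself; the name `stutter_outputs` and its interface are ours and serve only as an illustration of the table.

```python
def stutter_outputs(v, n, rounds):
    """Apply the stutter rule `rounds` times, starting in state n.

    v is a callable returning v_n for an index n.
    Returns (list of key stream bytes, final value of n).
    """
    out = []
    for _ in range(rounds):
        n += 1
        control = v(n)                      # stutter control register
        for k in range(4):                  # bit pairs, least significant first
            pair = (control >> (2 * k)) & 3
            if pair == 0:
                n += 1                      # no output
            elif pair == 1:
                out.append(105 ^ v(n + 1))
                n += 2
            elif pair == 2:
                out.append(v(n + 2))
                n += 2
            else:
                out.append(150 ^ v(n + 1))
                n += 1
    return out, n

# Toy sequence: two fixed control bytes, v_n = n everywhere else.
toy = lambda n: {1: 0b10111110, 8: 0b01010101}.get(n, n)
print(stutter_outputs(toy, 0, 2))
```

With an all-zero sequence the function never produces output and only advances `n`, which is the behaviour discussed later for weak keys.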



4 Analysis of the Frames 

In this section, we analyze the situation where an attacker knows different key 
streams generated with one key, but different frame numbers. We might regard 
the frame number as a part of the key. Hence, the attack described in this 
section could be classified as a related-key attack. Biham introduced this type 
of attack and applied it to various ciphers. We use the fact that different 
frames are strongly related to derive an efficient differential attack [2] based on 
this relation. It is the most serious attack presented in this paper, since both 
the number of necessary known plaintext bytes and the number of necessary 
operations to recover the key are small. In particular, we show the following 
result: 

Claim 1. Given the first 4 bytes of key stream for 64 frames generated with 
the same key and the frame numbers 0 through 63, we expect to find the 
key in about $c \cdot 2^{28}$ steps, where $c$ is the number of steps in the innermost loop. 

Due to the nature of the attack the time complexity can vary and is hard to 
examine, because the attack is based on a search tree whose size depends on 
the given input. Our attack is based on the assumption that the values of 
the first stutter byte are uniformly distributed, and hence that all 4 values for the 
first bit pair occur with about the same frequency. We will for example assume that 
at least 24 out of 64 bit pairs have the value 01 or 11. If that condition is not met, 
we can run the algorithm again with a lower threshold, and hence a larger tree 
to search. To verify that our assumptions are reasonable we have implemented 
the attack. Recovering the key often requires less than a minute on a 200 MHz 
Pentium. We also found that the algorithm indeed shows the expected behavior. 

Before we describe the attack, we will define some notations and point out 
some properties of the frame initialization that will be helpful for the attack. 

In order to distinguish the states of the LFSR produced with different frame
numbers $f$ we will denote the $n$-th value in the sequence $s$ for $f$ by
$s_n(f)$. The differential of two such sequences will be denoted by
$\Delta_n(f_1, f_2)$, i.e.

$$\Delta_n(f_1, f_2) := s_n(f_1) \oplus s_n(f_2).$$

The $j$-th byte of the key stream for frame $f$ will be denoted by $p_j(f)$.

The 6 least significant bits of the frame number are used twice during the
frame setup. They are added to $s_{40}$ and $s_{41}$ during the computation of
$s_{57}$ and $s_{58}$. Therefore, if $f_1$ and $f_2$ differ only in the 6 least
significant bits, and the attacker can guess $s_{40}$ and $s_{41}$, then he can
easily compute $\Delta_n(f_1, f_2)$ for all $n \ge 0$ by

$$\begin{aligned}
\Delta_n(f_1, f_2) &= 0 && \text{for } 0 \le n \le 56, \\
\Delta_{n+17}(f_1, f_2) &= 175 \otimes \big((s_n + a_n(f_1)) \oplus (s_n + a_n(f_2))\big) && \text{for } n = 40, 41, \\
\Delta_{n+17}(f_1, f_2) &= (175 \otimes \Delta_n(f_1, f_2)) \oplus \Delta_{n+4}(f_1, f_2) \oplus (141 \otimes \Delta_{n+15}(f_1, f_2)) && \text{for } n \ge 42.
\end{aligned}$$
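
The computation of the differentials against frame 0 can be sketched as follows. The sketch assumes the field polynomial x^8 + x^6 + x^3 + x^2 + 1 (`POLY = 0x14D`) and frame numbers between 0 and 63; the function name `deltas` is ours.

```python
POLY = 0x14D  # assumed field polynomial x^8 + x^6 + x^3 + x^2 + 1

def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

def deltas(s40, s41, f, upto=99):
    """Differentials between frame 0 and frame f (0 <= f <= 63), indices 0..upto."""
    d = [0] * (upto + 1)                     # zero up to index 56
    a40, a41 = f & 0xFF, (f >> 3) & 0xFF     # bytes added for frame f; frame 0 adds 0
    d[57] = gf_mul(175, s40 ^ ((s40 + a40) & 0xFF))
    d[58] = gf_mul(175, s41 ^ ((s41 + a41) & 0xFF))
    for n in range(42, upto - 16):           # purely linear part
        d[n + 17] = gf_mul(175, d[n]) ^ d[n + 4] ^ gf_mul(141, d[n + 15])
    return d

print(deltas(5, 9, 37)[82:100])
```

Only the guessed bytes `s40` and `s41` enter the computation; everything after index 58 follows from the linear recurrence.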

We can now give a short description of our attack. First we guess $s_{40}$ and
$s_{41}$ and compute the $\Delta$'s. Then we guess certain well chosen bits of
the internal state of the LFSR for the frame 0. Based on these guesses and the
precomputed $\Delta$'s we compute the corresponding bits for the LFSR state of
the other frames. These bits are then used to compute possible output bytes. We
don't know the state of the stutter byte, hence we cannot predict with
certainty what an output byte should look like. Therefore, we will use a
probabilistic approach, i.e. we reject our guess if considerably fewer output
bytes coincide with our prediction than we would expect for a correct guess.
Otherwise we extend our guess with a few more bits and test again. This can be
repeated recursively until we have either found a reason to reject that guess
or we know enough information about the LFSR state to compute the key. We will
now give a more detailed description of the attack.

Step 1: Guessing $s_{40}$ and $s_{41}$. First we guess the values $s_{40}$ and
$s_{41}$. Next, we compute the values $\Delta_n(0, f)$ for $82 \le n \le 99$
and $0 \le f \le 63$.

Different pairs of values of $s_{40}$ and $s_{41}$ may lead to equal values for
all $\Delta_n(0, f)$. Since we will only need the $\Delta$'s but not $s_{40}$
and $s_{41}$ we can use this fact to improve the algorithm. In particular the
most significant bit of $s_{40}$ has no influence on $\Delta_n(0, f)$.
Moreover, $s_{40} = 0$ and $s_{40} = 64$ give the same $\Delta$'s. Hence, it is
sufficient to choose $s_{40}$ satisfying $1 \le s_{40} \le 127$. Two values for
$s_{41}$ are equivalent if either their 3 least significant bits are all zero
or if their $k > 3$ least significant bits are equal and the $k$-th least
significant bit is zero. It follows that it is sufficient to choose $s_{41}$
from the following set:

{0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15,
25, 26, 27, 28, 29, 30, 31, 57, 58, 59, 60, 61, 62, 63,
121, 122, 123, 124, 125, 126, 127, 249, 250, 251, 252, 253, 254, 255}
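
That the listed set is sufficient can be checked mechanically. For frame numbers 0 to 63 the byte added to $s_{41}$ takes the values 0 to 7, and since multiplication by 175 is a bijection, two values of $s_{41}$ give the same differentials exactly when they agree on the XOR differences computed below. The helper `signature` and the list name `REPS` are ours.

```python
REPS = (list(range(0, 8)) + list(range(9, 16)) + list(range(25, 32)) +
        list(range(57, 64)) + list(range(121, 128)) + list(range(249, 256)))

def signature(s41):
    """XOR differences s41 ^ (s41 + a) for every possible added byte a = 0..7."""
    return tuple(s41 ^ ((s41 + a) & 0xFF) for a in range(8))

covered = {signature(r) for r in REPS}
# Every possible value of s41 behaves like one of the representatives.
print(all(signature(s) in covered for s in range(256)))
```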

Step 2: Guessing $s_{82}(0)$, $s_{84}(0)$, $s_{87}(0)$, $s_{94}(0)$ and
$s_{95}(0)$. Now we would like to guess the values for $s_{82}(0)$,
$s_{84}(0)$, $s_{87}(0)$, $s_{94}(0)$ and $s_{95}(0)$. In order to improve the
efficiency, we will start with guessing only the two least significant bits of
each byte. We test that guess and if the guess looks promising then we extend
it recursively by guessing the next least significant bits and test again.

Testing is done as follows. Assume that we have guessed the $k$ least
significant bits of $s_{82}(0)$, $s_{84}(0)$, $s_{87}(0)$, $s_{94}(0)$ and
$s_{95}(0)$. This allows us to compute the $k$ least significant bits of

$$s_n(f) = s_n(0) \oplus \Delta_n(0, f)$$

for $n \in \{82, 84, 87, 94, 95\}$. We can now compute the $k$ least
significant bits of

$$v_{82}(f) = (s_{82}(f) + s_{84}(f) + s_{87}(f) + s_{94}(f)) \oplus (s_{94}(f) \wedge s_{95}(f)).$$
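
The bit-by-bit guessing works because the $k$ least significant bits of the non-linear output depend only on the $k$ least significant bits of its five inputs: carries of the addition only travel upwards, and XOR and AND act bitwise. The following Python sketch checks this on random inputs; the names `nlf` and `low_bits_agree` are ours.

```python
import random

def nlf(s0, s2, s5, s12, s13):
    """Non-linear output byte computed from five register bytes."""
    return ((s0 + s2 + s5 + s12) & 0xFF) ^ (s12 & s13)

def low_bits_agree(values, k):
    """True if truncating the inputs to k bits leaves the k low output bits unchanged."""
    mask = (1 << k) - 1
    truncated = [v & mask for v in values]
    return (nlf(*values) & mask) == (nlf(*truncated) & mask)

rng = random.Random(1)
samples = [[rng.randrange(256) for _ in range(5)] for _ in range(1000)]
print(all(low_bits_agree(v, k) for v in samples for k in range(1, 9)))
```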

With probability 1/4 the two least significant bits of the first stutter byte
$v_{81}$ are 01, and with the same probability they are 11. Hence, with
probability 1/4 each, the first output byte $p_1(f)$ for frame $f$ is either
$v_{82}(f) \oplus 105$ or $v_{82}(f) \oplus 150$. Now, we count the number of
frames $f$ for which the $k$ least significant bits of $p_1(f)$ and either
$v_{82}(f) \oplus 105$ or $v_{82}(f) \oplus 150$ are equal. We would expect to
find about 32 matches on average if we have guessed the bits for $s_{82}(0)$,
$s_{84}(0)$, $s_{87}(0)$, $s_{94}(0)$ and $s_{95}(0)$ correctly. In our
experiments, we rejected the values for $s$ whenever we found fewer than 24
matches. The correct solution should be rejected by mistake with a probability
less than 3%.
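
The 3% bound can be checked with a short binomial computation. The sketch assumes that, for a correct guess, each of the 64 frames matches independently with probability exactly 1/2; accidental matches in the remaining cases would only raise the match probability and lower the rejection rate. The function name `reject_probability` is ours.

```python
from math import comb

def reject_probability(frames=64, threshold=24):
    """Probability of fewer than `threshold` matches when each frame matches with p = 1/2."""
    return sum(comb(frames, k) for k in range(threshold)) / 2 ** frames

print(reject_probability())
```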

After having guessed and checked all 8 bits, usually only a few, typically
fewer than 100, possibilities for the tuple
$(s_{82}(0), s_{84}(0), s_{87}(0), s_{94}(0), s_{95}(0))$ remain. We also know
the values for the $\Delta$'s and have partial knowledge about $s_{40}(0)$ and
$s_{41}(0)$. Consequently, the most time consuming part of the analysis is
already done.



Steps 3.1 to 3.4: Guessing more of the LFSR state. Next, we extend each of
these tuples by using a similar method involving $v_{83}(f)$, $v_{84}(f)$,
$v_{85}(f)$ and $v_{86}(f)$. In step 3.$i$ ($1 \le i \le 4$) we guess the
unknown values out of $s_{82+i}(0)$, $s_{84+i}(0)$, $s_{87+i}(0)$,
$s_{94+i}(0)$ and $s_{95+i}(0)$. Then we compute $v_{82+i}(f)$ and compare
these values to the known key stream. This comparison of $v_{82+i}(f)$ with the
key stream is slightly more complex than before, since we have to decide which
byte of the key stream we should use for comparison. Generally, we assume that
if there was a match with $v_{82+j}(f)$ for $j < i$ then the corresponding byte
was indeed generated using $v_{82+j}(f)$. We will use the following rules in
our algorithm:

- If a test $p_j(f) = v_i(f) \oplus x$ where $x \in \{0, 105, 150\}$ was
  successful then we use $p_{j+1}(f)$ for the next comparison with
  $v_{i+1}(f)$; otherwise we use $p_j(f)$ and $v_{i+1}(f)$.

- If $p_j(f) = v_i(f) \oplus 105$ then we will not test $v_{i+1}(f)$, since
  SOBER skips one byte after generating output of the form $v_i(f) \oplus 105$.

- A test $p_j(f) = v_i(f)$ will only be performed if the last test was not
  successful.

Step 4: Recovering the key. At this point we know all the bytes $s_i(0)$ for
$82 \le i \le 98$ with the exception of $s_{92}(0)$ and $s_{93}(0)$. Usually,
the previous steps narrow the possibilities to a few hundred cases. Hence we
can easily just test each case with all possible values for $s_{92}(0)$ and
$s_{93}(0)$. For each combination we compute $s_i(0)$ for
$i = 81, 80, \ldots, 16$ using

$$s_i(0) = 175^{-1} \otimes \big(s_{i+17} \oplus (141 \otimes s_{i+15}) \oplus s_{i+4}\big).$$

Finally, we recover the $i$-th key byte by

$$K_i = \Big(175^{-1} \otimes \big(s_{i+4} \oplus (141 \otimes s_{i+15}) \oplus s_{i+17}\big)\Big) - s_i.$$

Remember that $s_i$ for $0 \le i \le 16$ is equal to the $i$-th Fibonacci
number. We may now do some further tests such as decrypting the whole message
to check whether we have found the correct key.
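
The backward computation of Step 4 can be illustrated by a round trip: generate the sequence for frame 0 from a known key, keep only the window $s_{82}, \ldots, s_{98}$, and recover the key from it. The sketch assumes the field polynomial x^8 + x^6 + x^3 + x^2 + 1 (`POLY = 0x14D`); the function names `sequence` and `recover_key` are ours, and the example key is arbitrary.

```python
POLY = 0x14D  # assumed field polynomial x^8 + x^6 + x^3 + x^2 + 1

def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r

INV175 = next(x for x in range(1, 256) if gf_mul(175, x) == 1)

def fib17():
    s = [1, 1]
    while len(s) < 17:
        s.append((s[-1] + s[-2]) & 0xFF)
    return s

def sequence(key, upto=98):
    """s_0, ..., s_upto for frame number 0 (all frame bytes are then 0)."""
    s, length = fib17(), len(key)
    for n in range(upto - 16):
        a = key[n] if n < length else (length if n == length else 0)
        s.append(gf_mul(141, s[n + 15]) ^ s[n + 4] ^ gf_mul(175, (s[n] + a) & 0xFF))
    return s

def recover_key(window, first=82, length=16):
    """Recover the key from window = [s_first, ..., s_{first+16}] of frame 0."""
    s = {first + i: v for i, v in enumerate(window)}
    for i in range(first - 1, 16, -1):      # run the recurrence backwards to s_17
        s[i] = gf_mul(INV175, s[i + 17] ^ gf_mul(141, s[i + 15]) ^ s[i + 4])
    for i, v in enumerate(fib17()):         # s_0..s_16 are the Fibonacci numbers
        s[i] = v
    return bytes((gf_mul(INV175, s[i + 4] ^ gf_mul(141, s[i + 15]) ^ s[i + 17]) - s[i]) & 0xFF
                 for i in range(length))

key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")   # arbitrary example key
print(recover_key(sequence(key)[82:99]) == key)
```

The round trip succeeds because every cycle of the register is invertible once the added byte is known, and for frame 0 all added bytes after the key setup are zero.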



A Very Rough Analysis 

It is hard to analyze this algorithm analytically. However, we roughly estimate
its complexity as follows. We have to guess $s_{40}$ (7 bits) and $s_{41}$
(5.4 bits) and the 2 least significant bits of $s_{82}(0)$, $s_{84}(0)$,
$s_{87}(0)$, $s_{94}(0)$ and $s_{95}(0)$ (10 bits) before we can make the first
test in step 2. Each test has to compare with 64 frames. Hence, so far we have
a time complexity of approximately $2^{7+5.4+10+6}$, which is slightly more
than $2^{28}$. Our experiments show that the first test cuts enough nodes in
the search tree that the remaining tree is almost negligible. Hence this rough
analysis gives an approximation for the actual runtime of the algorithm.

5 Analysis of the Cipher Stream 

In this section we analyze the nonlinear output of SOBER and investigate how
difficult it is to compute the internal state of the LFSR given the key stream
of the cipher. In particular, we will now show the following result:

Claim 2. Assume that 17 consecutive key stream bytes $p_1, \ldots, p_{17}$ are
known. Assume further that these bytes are generated from two stutter control
bytes and that the value of the first byte is $10111110_2$ and the second one
is $01010101_2$. Then it is possible to find the internal state of the LFSR in
$c \cdot 2^{72}$ steps, where $c$ is a small constant denoting the number of
steps for the innermost loop.



Let the first stutter byte be $v_n$. To simplify notation we'll use
$t_i = s_{n+i}$. From the assumption on the stutter control register the
following is known (all equations modulo 256):

$$\begin{aligned}
(t_0 + t_2 + t_5 + t_{12}) \oplus (t_{12} \wedge t_{13}) &= 10111110_2 && (5) \\
(t_2 + t_4 + t_7 + t_{14}) \oplus (t_{14} \wedge t_{15}) &= p_1 && (6) \\
(t_3 + t_5 + t_8 + t_{15}) \oplus (t_{15} \wedge t_{16}) &= p_2 \oplus 10010110_2 && (7) \\
(t_4 + t_6 + t_9 + t_{16}) \oplus (t_{16} \wedge t_{17}) &= p_3 \oplus 10010110_2 && (8) \\
(t_6 + t_8 + t_{11} + t_{18}) \oplus (t_{18} \wedge t_{19}) &= p_4 && (9) \\
(t_7 + t_9 + t_{12} + t_{19}) \oplus (t_{19} \wedge t_{20}) &= 01010101_2 && (10) \\
(t_8 + t_{10} + t_{13} + t_{20}) \oplus (t_{20} \wedge t_{21}) &= p_5 \oplus 01101001_2 && (11) \\
(t_{10} + t_{12} + t_{15} + t_{22}) \oplus (t_{22} \wedge t_{23}) &= p_6 \oplus 01101001_2 && (12)
\end{aligned}$$



The algorithm works as follows. First we guess the 9 bytes
$t_0, t_4, t_5, t_6, t_{12}, t_{13}, t_{15}, t_{22}$ and $t_{23}$.

We can use these equations to solve for more values $t_i$ as follows:

use equation   to compute
(1)            $t_{17} = (141 \otimes t_{15}) \oplus t_4 \oplus (175 \otimes t_0)$
(5)            $t_2 = (10111110_2 \oplus (t_{12} \wedge t_{13})) - (t_0 + t_5 + t_{12}) \pmod{256}$
(12)           $t_{10} = (p_6 \oplus 01101001_2 \oplus (t_{22} \wedge t_{23})) - (t_{12} + t_{15} + t_{22}) \pmod{256}$
(1)            $t_{19} = (141 \otimes t_{17}) \oplus t_6 \oplus (175 \otimes t_2)$
(1)            $t_{21} = 141^{-1} \otimes (t_{23} \oplus t_{10} \oplus (175 \otimes t_6))$
(1)            $t_8 = t_{21} \oplus (141 \otimes t_{19}) \oplus (175 \otimes t_4)$
(11)           $t_{20}$ (see below)
(1)            $t_9 = t_{22} \oplus (141 \otimes t_{20}) \oplus (175 \otimes t_5)$
(10)           $t_7 = (01010101_2 \oplus (t_{19} \wedge t_{20})) - (t_9 + t_{12} + t_{19}) \pmod{256}$
(6)            $t_{14}$ (see below)
(8)            $t_{16}$ (see below)
(7)            $t_3 = (p_2 \oplus 10010110_2 \oplus (t_{15} \wedge t_{16})) - (t_5 + t_8 + t_{15}) \pmod{256}$
(1)            $t_{18} = 141^{-1} \otimes (t_{20} \oplus t_7 \oplus (175 \otimes t_3))$
(1)            $t_1 = 175^{-1} \otimes ((141 \otimes t_{16}) \oplus t_5 \oplus t_{18})$
(9)            $t_{11} = (p_4 \oplus (t_{18} \wedge t_{19})) - (t_6 + t_8 + t_{18}) \pmod{256}$
(1)            $t_{24}, t_{25}, \ldots$

We now have enough information to compute the corresponding key stream and
compare it to $p_7, \ldots, p_{17}$. If all the values are equal then we output
$t_0, \ldots, t_{16}$ as a possible candidate for the internal state of the
LFSR. Solving most of these equations is easy, since the equations are either
linear over $GF(2^8)$ or over $\mathbb{Z}/(256)$. The only non-linear equations
are the equations (6), (8) and (11). These equations have the form

$$(A + X) \oplus (X \wedge B) = C,$$

where $A, B, C$ are given and $X$ is unknown. There are $2^{24}$ possible
combinations for $A$, $B$ and $C$, hence it is feasible to precompute the set
of solutions of all equations and store the result. Then solving these
non-linear equations requires only table look-ups. Moreover, there are in total
$2^{24}$ solutions for the $2^{24}$ equations. Hence, we expect one solution on
average for each of the equations (6), (8) and (11). Therefore we may assume
that testing each guess for the 9 bytes
$t_0, t_4, t_5, t_6, t_{12}, t_{13}, t_{15}, t_{22}$ and $t_{23}$ takes a
constant number $c$ of steps and thus the whole algorithm needs about
$c \cdot 2^{72}$ steps.
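
The counting argument for the non-linear equations can be checked directly: for fixed $A$ and $B$, every $X$ produces exactly one $C$, so the solutions over all $C$ add up to 256. The Python sketch below solves one equation by exhaustive search and builds the look-up table for a single value of $B$; the names `solve` and `build_table` are ours, and a full table over all $B$ would simply repeat this 256 times.

```python
def solve(a, b, c):
    """All bytes X with (A + X) XOR (X AND B) = C, addition taken modulo 256."""
    return [x for x in range(256) if (((a + x) & 0xFF) ^ (x & b)) == c]

def build_table(b):
    """Precompute, for one fixed B, the solutions for every pair (A, C)."""
    table = {}
    for a in range(256):
        for x in range(256):
            c = ((a + x) & 0xFF) ^ (x & b)
            table.setdefault((a, c), []).append(x)
    return table

table = build_table(0x5A)
print(sum(len(v) for v in table.values()))   # 256 * 256 solutions for this B
```

Some pairs (A, C) have no solution and others have several, which is why the attack expects one solution per equation only on average.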

If we know enough key stream then we can repeat the attack above for all
tuples of 17 consecutive key stream bytes. Since a stutter byte is used for 3
key stream bytes on average, it follows that $n + 17$ consecutive key stream
bytes contain on average $n/3$ sequences of 17 bytes of key stream where the
first byte was generated with a new stutter byte. The probability that such a
sequence was generated starting with the two stutter bytes $10111110_2$ and
$01010101_2$ is $2^{-16}$.

Claim 3. There exists an algorithm that, given $n + 16$ consecutive key stream
bytes of SOBER, can find the internal state of the LFSR in $c n 2^{72}$ steps
with probability about $1 - (1 - 2^{-16})^{n/3}$, where $c$ denotes the number
of steps for performing the innermost loop of our algorithm.

Hence, we expect to find the internal state of SOBER after examining about
$n = 3 \cdot 2^{16}$ bytes using about $c \cdot 2^{89.6}$ steps on average.
Respectively, after $c \cdot 2^{89.1}$ steps we have found the key with
probability 1/2. Even though this is still a huge number, the security of SOBER
is nonetheless much lower than what we would expect from a cryptosystem with
128 bit keys.
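
The two exponents follow from the success probability of a single attempt. The short Python computation below reproduces them, assuming a cost of 2^72 steps (times the constant c) per key stream position and a success probability of 2^-16 for each of the n/3 candidate positions; the variable names are ours.

```python
from math import log, log1p, log2

p = 2.0 ** -16                            # chance that a candidate position fits

# Expected case: about 2^16 candidate positions, i.e. n = 3 * 2^16 bytes.
steps_expected = log2(3 * 2 ** 16) + 72

# Success probability 1/2: solve (1 - p)^m = 1/2 for the number m of candidates.
m_half = log(0.5) / log1p(-p)
steps_half = log2(3 * m_half) + 72

print(round(steps_expected, 1), round(steps_half, 1))
```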

Remarks. Variants of the attack described in this section are possible and
hence an attacker would have some flexibility. First, it can be noticed that
the bytes $p_7, \ldots, p_{17}$ are only used in the last step of the attack to
distinguish correct guesses of the LFSR state from wrong guesses. This
differentiation would also be possible if bytes other than
$p_7, \ldots, p_{17}$ are known. Additionally, statistical information on the
values $p_7, p_8, \ldots$ would already be sufficient to complete the attack.

The algorithm we have described is based on the assumption that we can find
key stream bytes $p_1, \ldots$ produced with two stutter bytes being
$10111110_2$ and $01010101_2$. These values have the property that the
resulting system of equations is easily solvable. However, they are chosen
somewhat arbitrarily from a set of equally well suited values. Other values
would lead to a different system of equations, with a different method to
solve. For simplicity we haven't described any such alternative attacks here.
It should, however, be noted that these alternatives could possibly be used to
reduce the number of necessary key stream bytes.

6 Analysis of the Key Setup 

In this section, we analyze the key setup. During the key setup a key of length
40 bits to 128 bits is expanded into an initial state of the LFSR register. Not
all $2^{136}$ states of the LFSR are therefore possible. Hence a desirable
property of the key setup would be that the knowledge of the key setup
procedure is not helpful for cryptanalysis. However, we show that SOBER leaves
the LFSR in a state that can be easily described and that this information is
helpful for cryptanalysis.

In the following we will restrict ourselves to the analysis of the key setup
for 80 bit keys, since this key size is often recommended for applications that
do not require high security. Similar attacks can be found for other key sizes
too.

Claim 4. Given 12 consecutive bytes of known plaintext and corresponding
ciphertext encrypted with an unknown 80 bit key, it is possible to find the
key in about $c \cdot 2^{68}$ operations with probability 9/16.

The idea behind the attack is as follows: The attacker guesses 4 bits of the
first stutter byte. This will give him two equations such as the ones used in
Section 5 if the two bit pairs are different from 00. This happens with
probability 9/16. Then he can guess 8 bytes of the internal state of the LFSR,
such that he can compute 2 more bytes. The remaining state of the LFSR can now
be found by using Equation (4). Finally, the solution has to be verified by
computing more key stream bytes and comparing them to some known plaintext.



7 Weak Keys 

In this section, we describe how to find keys that generate no key stream. We 
will call them weak keys. 

After the key and frame setup it can happen that the LFSR contains only 0's. In
that case, all non-linear output bytes and therefore all stutter bytes are
equal to zero. Hence the stream cipher will loop forever without generating a
single byte of key stream.

The probability that this event occurs is very low, i.e., for a randomly chosen
key and frame number this will occur only with probability $2^{-136}$.

The existence of such keys is no risk to privacy. However, the party that
chooses the keys might use this weakness to initiate a denial of service attack
against another party. Generally, we recommend checking for such a state of the
LFSR. One possible countermeasure is to reinitialize the stream cipher with a
different frame number (e.g. $2^{32} - 1$). This would be a sufficient
countermeasure, since at most one frame for every key can show this behavior.

Claim 5. There exists an efficient algorithm that, given a frame number $f$,
finds a key $K$, if such a key exists, such that the LFSR is in the zero state
after initializing the key $K$ and the frame number $f$. The probability that
such a key exists for a randomly chosen frame number is larger than 1/256.

Assume that we are given a frame number $f$ and that we are looking for a key
$K$ such that the LFSR after the initialization is in the zero state. Since
each of the steps during the initialization of the frame number is linear,
these steps can easily be reversed. Next, we try to find the corresponding key.
For simplicity, we are only interested in 128-bit keys. Given the state after
the initialization of the key (i.e. $n = 40$), the state of the LFSR after step
$n = 17$ can be computed by reversing the effect of cycling the register, and
hence $s_{17}, \ldots, s_{33}$ can be computed. Since $s_0, \ldots, s_{16}$ are
also known, we can now easily compute the $i$-th key byte $K_i$ from

$$(175 \otimes (s_i + K_i)) \oplus s_{i+4} \oplus (141 \otimes s_{i+15}) = s_{i+17}.$$

Finally we have to check whether adding the key length 16 to $s_{16}$ and
cycling the register would produce the correct value for $s_{33}$. We may
assume that this will be the case with probability 1/256. The runtime of this
algorithm is negligible. A few pairs of keys and corresponding frame numbers
are given below.



key                                frame
85FA3E93F1993225E71B13EFC0811DAC   114
34149FED30DEAC25D4FB89A0F0551DA7   12F
FA1F9189BEE0A2128BA818165B83F86E   240
376DA8DCF4632B0FD4A3EB745E3DB584   291
8A7F49B63524B10FE78371BB4E09B57F   2AE
956F2E347D8CC9F50F14C978C68E740B   530
237DE6F06CE5BAEBD58040767F00D31A   5BE
C7DC7B5B6FAFFCF1CFADB819493C4D77   63F
5EE4BB2970166108F78CFE33374CAE94   7E5



8 Conclusion 

We have shown different flaws in SOBER. We have implemented the most serious
attack and shown that we can often recover the key in less than 1 minute on a
200 MHz Pentium. Greg Rose has developed a newer, still unpublished version of
SOBER. This version is based on commissioned cryptanalysis by Codes & Ciphers
Ltd. That work is unpublished, but it has been partially mentioned elsewhere.
It seems that attacks similar to some of those described in this paper have
been found independently of our analysis. The new version of SOBER takes
countermeasures against these attacks. Moreover, these countermeasures seem to
avoid a further attack described in this paper, though further analysis is
still necessary.